This project differs from Sopan's project in that it uses two unzippers that tolerate zip corruption better than PowerPoint does. Zip corruption appears to be the most common cause of corruption in PPTX files. There are two versions of the converter, one for each unzipper, and they sometimes succeed in extracting text from corrupt PPTX files where PowerPoint 2007 and 2010 fail.
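To illustrate what tolerating zip corruption can mean in practice, here is a minimal Python sketch. It is not the code of either unzipper used by this project, and the function name salvage_entries is hypothetical. It assumes the damage is in the central directory at the end of the archive, so it ignores that directory and instead scans for local file headers, inflating each entry directly.

```python
import struct
import zlib

# Fixed 30-byte part of a zip local file header: signature, version, flags,
# method, mod time, mod date, CRC-32, compressed size, uncompressed size,
# file name length, extra field length.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")
SIGNATURE = b"PK\x03\x04"


def salvage_entries(data: bytes) -> dict[str, bytes]:
    """Recover entries by scanning local headers (hypothetical helper).

    The central directory is never read, so an archive whose tail is
    truncated or overwritten can still yield its earlier entries.
    """
    entries = {}
    pos = data.find(SIGNATURE)
    while pos != -1:
        if pos + LOCAL_HEADER.size > len(data):
            break  # header itself is cut off
        (_, _, _, method, _, _, _, csize, _, name_len, extra_len) = (
            LOCAL_HEADER.unpack_from(data, pos)
        )
        name_start = pos + LOCAL_HEADER.size
        body_start = name_start + name_len + extra_len
        name = data[name_start:name_start + name_len].decode("utf-8", "replace")
        if method == 8:  # deflate
            # Raw deflate stream: the decompressor stops at the end of the
            # stream on its own, so the recorded size is not needed.
            inflater = zlib.decompressobj(-15)
            try:
                entries[name] = inflater.decompress(data[body_start:])
            except zlib.error:
                pass  # this entry is damaged; keep looking for others
        elif method == 0:  # stored without compression
            entries[name] = data[body_start:body_start + csize]
        pos = data.find(SIGNATURE, pos + 4)
    return entries
```

A scan like this can mistake a chance occurrence of the signature bytes inside compressed data for a header, so a real tool would validate the recovered entries, for example against the stored CRC-32.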
The programs also use regular expressions to extract the text from the XML files, rather than the strict XML libraries that, judging from its error messages, PowerPoint appears to be using. So when PowerPoint reports an unreadable XML error, these programs can sometimes still extract useful text.
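The regular-expression approach can be sketched as follows. This is an illustration in Python, not the pattern these programs actually use, and the name extract_text is hypothetical. It relies on the fact that slide text in PPTX files is stored in <a:t> elements, and it never checks that the surrounding XML is well formed.

```python
import html
import re

# Matches the content of <a:t>...</a:t> text runs. The optional group allows
# attributes on the opening tag while not matching other tags that merely
# start with "a:t", such as <a:tbl>.
TEXT_RUN = re.compile(r"<a:t(?:\s[^>]*)?>(.*?)</a:t>", re.DOTALL)


def extract_text(slide_xml: str) -> list[str]:
    """Return the text runs found in slide XML (hypothetical helper)."""
    # html.unescape turns entities such as &amp; back into plain characters.
    return [html.unescape(run) for run in TEXT_RUN.findall(slide_xml)]


# The final closing tag is cut off, so a strict XML parser would reject this.
broken = "<p:sld><a:p><a:r><a:t>Hello &amp; welcome</a:t></a:r><a:r><a:t>World</a:t></a:r></a:p"
print(extract_text(broken))  # ['Hello & welcome', 'World']
```

Because the pattern only needs each text run to be intact, damage elsewhere in the file does not prevent extraction, which is the trade-off against a strict parser.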