Extractor
 

Extract the pure essence, the pure meaning of a subject matter, of a document, an ebook or a series of search engine results by surfacing their key words and key phrases - automatically. 

Used by search engine optimization (SEO) and document management companies alike, the Extractor summarization technology reads a document, much like a human being does, returning lists of the keywords and key phrases accurately weighted as they are found in that document, text or web page.

Uniquely positioned for web services, Extractor is immediately capable of consuming documents of any length and subject matter, distilling the precise, contextual meaning of the target content into keyword and key phrase summary formats. Extractor's unique patented technology delivers precise content summaries in any subject domain without retraining and without human intervention.

Why is Extractor Different : The Extractor technology uses a patented set of core algorithms to extract (read) keywords and key phrases from any text-based document. The patented process allows Extractor to maintain the context in which the subject matter has been expressed and one of the primary differences. In essence a machine learned method for reading (summarizing ) any document of any subject matter written in an electronic text format. The distinction of machine learned is important in contrast to other methods of summarization discussed in the market place, which typically use referential and / or probability based theories for trying to determine content. These contrasting approaches require that their algorithms be trained and retrained per specific subject domain (non-machine learned) and are usually English language based. The Extractor Technology does not require per subject training and retraining allowing it to be used across any set of documents in one or a mix of subject domains and in any of six international languages (English, French, German, Japanese, Korean and Spanish).

Platform agnostic : At its core the Extractor component software is designed for developers using the most common computing and design platforms which today include Windows, Linux and Solaris. The architecture of the Extractor Technology does allow for custom system compiles as need arise. The commercial software development kit (SDK) includes development support for C, VC++, C#, Java, Perl, Python (Visual Studio and Visual Studio .NET).

Summarization : Extractor is exceptionally good at web content summarization incorporating its patented technology to summarize text, e-mail and html content into weighted lists of keywords and key phrases extracting the primary contextual sentence highlight of how the keyword / key phrase has been used. Uniquely positioned for web services, Extractor is immediately capable of consuming documents of any length and subject matter, distilling the precise, contextual meaning of the content into keyword and key phrase summary formats. Extractor's unique patented technology delivers precise content summaries of any subject domain without retraining and without human intervention.

Relevant Information : Not just information but contextually accurate, relevant information is a critical tool for the success of business today. Being able to source relevant information in context of the subject matter is what gives organizations an ultimate competitive advantage. Rather than working through traditional, time consuming, iterative search engine processes, incorporating Extractor into Enterprise systems empowers corporate information with relevant and meaningful representations meeting the needs of today's social workforce.

For an interactive demonstration of how Extractor can provide relevant information and assist knowledge workers please see http://www.picoFocus.com

 
 
Copyright � 2009, DBI Technologies Inc.
   
   
  Feature Highlights
  Keyword extraction
  Key phrase extraction
  Automatic ranking
  Ascending/ descending order
  Sentence highlight
  Go phrase (find)
Stop words (ignore)
   
  Extractor
  Buy ItTry It
   
  Great for
  workforce optimization
  weblog tagging
  refined search
  knowledge management (KM)
  information retrieval (IR)
  semantic web development
  indexing
  categorization
  cataloguing
  inference engines
  document management
  portal services
  speed reading
  trend analysis
   
  Technical Details
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
  The World of Relevant Information
in the Palm of Your Hand