IUPR Research Group (DFKI and UniKL)

Research

Pattern Recognition and Machine Learning

Despite the large number of papers written on pattern recognition and machine learning, and numerous experiments purporting to show that some methods are superior to others, the choice and application of pattern recognition and machine learning algorithms remains a black art.

The goal of our projects in pattern recognition is to bring sound mathematical and scientific principles to bear on evaluating and comparing pattern recognition and machine learning algorithms, characterizing data sets, and creating tools that automate the selection, deployment, and testing of pattern recognition methods.
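
As one illustration of what a principled comparison can look like, the sketch below applies an exact McNemar test to two classifiers evaluated on the same test samples. It is not taken from the group's own tools; the function name `mcnemar_exact` and the sample data are hypothetical, and the test is only one of several reasonable ways to compare paired results.

```python
from math import comb


def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar test on paired per-sample correctness flags.

    correct_a[i] and correct_b[i] say whether classifier A and classifier B
    (hypothetical names) labeled test sample i correctly.
    Returns (only_a, only_b, p_value).
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("both classifiers must be scored on the same samples")

    # Only discordant pairs (exactly one classifier correct) carry information.
    only_a = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)
    only_b = sum(1 for a, b in zip(correct_a, correct_b) if b and not a)
    n = only_a + only_b
    if n == 0:
        return only_a, only_b, 1.0

    # Under the null hypothesis each discordant pair favors A or B with
    # probability 1/2, so the smaller count follows Binomial(n, 0.5).
    k = min(only_a, only_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return only_a, only_b, min(1.0, 2 * tail)


# Illustrative data only: 1 = correct, 0 = wrong, on ten shared test samples.
a = [1, 1, 1, 1, 1, 1, 0, 1, 1, 1]
b = [1, 0, 0, 0, 1, 0, 1, 0, 1, 1]
print(mcnemar_exact(a, b))
```

With so few discordant pairs, the apparent advantage of the first classifier is not statistically significant, which is the kind of conclusion a raw accuracy comparison tends to hide.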

OCR, Information Retrieval, Digital Libraries


Document analysis deals with the visual and geometric analysis of document images. The goal is to recover textual content, geometric structure, and logical structure. With the resurgent interest in digital libraries and large-scale scanning operations by organizations like Google, Microsoft, and the Internet Archive, document analysis has once again become a very important real-world problem. We are addressing document analysis at all levels: camera-based document and book capture, OCR and handwriting recognition, document retrieval, and document enhancement. In addition to its practical applications, document analysis is also an important test case for more general computer vision and machine learning algorithms due to the availability of large amounts of correctly ground-truthed data.

OCR and Layout Analysis: OCRopus project OCRopus demo page layout analysis demo

Camera-Based Document Capture: OSCAR camera-based document capture demo document dewarping demo

Content Analysis and Information Extraction: appearance-based document retrieval demo bibliographic reference recognition demo

Computing for the Humanities: historical document analysis/comparison demo

Additional demos are listed on the IPeT Demo Page

For a general overview, please see The OCRopus Open Source OCR System


Document Image Security and Document Forensics


Paper-based documents are widely used for identification, authentication, and legal purposes. Forgery of such documents is a major component of insurance fraud, immigration fraud, tax evasion, and other white-collar crime. Although optical security measures like holograms and special paper are partial solutions in areas such as currencies and passports, they are expensive and are not applicable when the creation of the document is not under the control of the organization that needs to verify it. We are developing techniques that allow the authenticity of ordinary paper documents to be verified using optical techniques.

Image and Video Content Analysis


Large numbers of images and videos are captured in numerous contexts: consumer digital imaging, surveillance, industrial inspection, satellite imagery, astronomy, and many other areas. We are applying image processing, pattern recognition, and machine learning techniques to problems such as the detection of anomalous behaviors, defect detection in industrial inspection, quantitative analysis of large amounts of astronomical image data, and media asset management.

Whitepaper German English


Intelligent Network Security

Today, network security relies largely on systems techniques like secure protocols and rule/pattern-based methods.



We are applying statistical, decision-theoretic, pattern recognition, and machine learning techniques to the automated and adaptive analysis of network traffic. We focus on:

  • Identification and remediation of DDoS attacks and intrusion attempts (zero-day exploits)
  • Behavioral analysis and anomaly detection
  • Traffic modeling and forecasting in networks
  • Early warning in critical infrastructures
For more information, see http://netsec.iupr.com/
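
To give a concrete sense of anomaly detection on traffic data, the sketch below flags intervals whose packet count deviates strongly from the recent past, using a rolling z-score. It is a deliberately simple illustration and not the group's method; the function name `flag_anomalies`, the window size, the threshold, and the sample counts are all assumptions.

```python
from statistics import mean, stdev


def flag_anomalies(counts, window=5, threshold=3.0):
    """Return indices whose value deviates strongly from the preceding window.

    counts    -- packets (or flows) observed per time interval
    window    -- number of preceding intervals used as the baseline (>= 2)
    threshold -- how many standard deviations count as anomalous
    """
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly constant baseline: any change at all is a deviation.
            if counts[i] != mu:
                flagged.append(i)
            continue
        if abs(counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Illustrative per-interval packet counts with one burst at index 6.
traffic = [100, 102, 98, 101, 99, 100, 500, 101]
print(flag_anomalies(traffic))
```

A fixed threshold on a short window is easy to reason about but reacts poorly to daily or weekly traffic patterns, and a burst inflates the baseline for the intervals that follow it; adaptive models are intended to address such limitations.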