Natural language processing services

We have extensive experience with natural language processing in Japanese and a range of Western languages. In our text processing work we combine open source technologies with our own framework (ATA), corpora and other lexical assets.

We can help with

  • Large scale text mining and web crawling
  • Corpus tagging and machine learning
  • Named entity extraction and entity normalization
  • Document clustering and keyphrase extraction
  • English ↔ Japanese transliteration
  • Text parsing, cleansing and normalization
  • Sentiment analysis (experimental)
  • Search log data mining
  • Social media analytics

Please do not hesitate to contact us if you need help with a different NLP problem.

ATA is our own framework for text analytics.

ATA contains high-quality, reusable components for a range of NLP tasks, many of which are based on state-of-the-art academic research.


Major functional areas include:

  • An end-to-end machine learning tool-chain for information extraction, labeling, etc.
    • A modern web interface for easy corpus annotation
    • A suite of machine-learning tools and algorithms
    • Visual tools for model validation and error analysis
  • Various text mining tools for query logs
  • Various text mining tools for Twitter messages
  • Information extraction tools for Wikipedia and other corpora

ATA is particularly useful for processing Japanese text, but it also works well for many other languages.

Source code license included

We include a full source code license for the ATA components we use in our NLP projects.