Evaluate and recommend an NLP Toolkit

Closed - This job posting has been filled and work has been completed.

Job Description

I'd like to choose an NLP toolkit to use on an ongoing basis at my company.

This project is to evaluate a number of NLP toolkits and report the advantages and disadvantages of each against my criteria.

Our uber-high-level goal is to extract structured knowledge about products (say cameras, TVs, or movies) from the web, so we can help people decide which one is best for them.

Requirements:
- Easy to develop against with Java
- Open source (doesn't have to be free though)
- Has to be an SDK, not a SaaS online API

Criteria to evaluate each toolkit against:
- How frequent are releases (more frequent is better)
- Are bugs fixed quickly
- Are new features and improvements being developed
- How popular is it within the research community
- How popular is it within industry
- How active is the developer mailing-list or forum
- How many URLs does Google have indexed that are related to the toolkit
- Is the licensing compatible with our preferences (no source code sharing, no data sharing)
- Learning curve: How easy is it to use and learn
- Deployment: How easy is it to use and deploy in a JVM environment
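
One way the criteria above could be rolled into a single comparable number per toolkit is a weighted scorecard. The following is only a minimal Java sketch under stated assumptions: each criterion is rated 1 to 5, and the class name, criterion keys, weights, and ratings are all hypothetical placeholders, not our actual priorities or any real evaluation result.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: combines per-criterion ratings into one weighted score.
final class ToolkitScorecard {

    // Returns the weighted average of the ratings, so the result stays on the
    // same scale as the inputs. Weights do not need to sum to 1.
    static double weightedScore(Map<String, Double> weights, Map<String, Integer> ratings) {
        double total = 0.0;
        double weightSum = 0.0;
        for (Map.Entry<String, Double> entry : weights.entrySet()) {
            Integer rating = ratings.get(entry.getKey());
            if (rating == null) {
                throw new IllegalArgumentException("No rating for criterion: " + entry.getKey());
            }
            total += entry.getValue() * rating;
            weightSum += entry.getValue();
        }
        if (weightSum <= 0.0) {
            throw new IllegalArgumentException("Weights must sum to a positive value");
        }
        return total / weightSum;
    }

    public static void main(String[] args) {
        // Assumed weights (placeholders): a higher weight means the criterion matters more.
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("release frequency", 1.0);
        weights.put("licensing compatibility", 3.0);
        weights.put("JVM deployment", 2.0);

        // Placeholder ratings for an unnamed candidate, not real evaluation results.
        Map<String, Integer> ratings = new LinkedHashMap<>();
        ratings.put("release frequency", 4);
        ratings.put("licensing compatibility", 5);
        ratings.put("JVM deployment", 3);

        System.out.printf("Candidate score: %.2f%n", weightedScore(weights, ratings));
    }
}
```

A weighted average is used here rather than a plain sum so that adding or removing a criterion does not change the scale of the result. Hard requirements such as licensing may be better handled as pass/fail filters before any scoring.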

NLP tasks in the short term:
- entity extraction (what product is this consumer review talking about)
- extracting quotes from reviews (e.g. "This camera's video is amazing")
- sentiment analysis (is the quote positive or negative)
- topic classification (is this quote talking about video capability or image quality?)

NLP tasks in the long term:
- parsing product specs and ratings from unstructured text
- parsing user queries like "Best digital cameras" and turning them into executable queries
- review quote similarity/clustering to discover topics
- machine learning to find helpful reviews

Background:
- We have a development team of 8 people, expanding to 15
- We're just getting started with NLP and hope to do a lot more, say 4 developers full time

Deliverables:
- A list of toolkits considered
- A list of toolkits that were ruled out completely
- A detailed comparison of the top 5 toolkits against our criteria and the short-term and long-term NLP tasks
- A recommendation

Notes:
- Please include Python NLTK in the comparison

---
Skills: nlp, research, analysis