The project aims to do the following:
1. Regularly crawl the internet to collect material published on topics of interest from selected web-based sources (news websites, blogs, etc.).
2. Display the crawled news items to users date-wise, source-wise, topic-wise, or country-wise.
Scope of Work:
1. This project will extract news and blog posts written on selected topics of concern to DRDO from the internet. To do this, it must crawl the web periodically, at least once every 2 hours.
2. The websites to be crawled will include (but may not be limited to) the following:
The Times of India
The Asian Age
The Hindustan Times
Washington Post (US)
The New York Times (US)
Voice of America
Daily News (Pakistan)
Beijing Bulletin (China)
China Daily (China)
The Sun (UK)
Daily Mail (UK)
Israel National News (Israel)
Israel Herald (Israel)
The Korea Times (S. Korea)
The Korea Herald (S. Korea)
The Moscow Times (Russia)
Wall Street Journal
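Items 1 and 2 together suggest a registry of sources, each tagged with a country, plus a check for which sources are due for another crawl. The following is a minimal sketch in Python under stated assumptions: the `Source` record and the `sources_due` helper are hypothetical names, the URLs are placeholders rather than the sites' real addresses, and the fetching step itself is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Requirement from item 1: crawl at least once every 2 hours.
MAX_CRAWL_INTERVAL = timedelta(hours=2)


@dataclass
class Source:
    """Hypothetical record for one website to be crawled."""
    name: str
    country: str
    url: str  # placeholder value, to be replaced by the real address
    last_crawled: Optional[datetime] = None


def sources_due(sources: List[Source], now: datetime) -> List[Source]:
    """Return sources never crawled, or last crawled 2 or more hours ago."""
    return [
        s for s in sources
        if s.last_crawled is None or now - s.last_crawled >= MAX_CRAWL_INTERVAL
    ]


now = datetime(2024, 1, 1, 12, 0)
sources = [
    Source("The Times of India", "India", "https://example.com/toi",
           last_crawled=now - timedelta(hours=3)),
    Source("China Daily", "China", "https://example.com/chinadaily",
           last_crawled=now - timedelta(minutes=30)),
    Source("The Moscow Times", "Russia", "https://example.com/moscowtimes"),
]
due_names = [s.name for s in sources_due(sources, now)]
print(due_names)
```

In practice the check would need to run more often than every 2 hours (for example, every few minutes) so that no source exceeds the required interval while waiting for the next check.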
3. The news items crawled from the above-mentioned websites will pertain to (but may not be limited to) the following topics/keywords:
DRDO products (missiles, armaments, tanks, UAV, LCA, AWACS, etc.)
Armed forces and self-reliance
Defence agreements and relationships between India and other countries such as Pakistan, China, South Korea, the UK, the USA, and Israel
Nuclear weapons and nuclear powers
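One way to decide which topics a crawled item pertains to is simple keyword matching against its text. The sketch below assumes a hypothetical `TOPIC_KEYWORDS` table and `match_topics` function; the keyword lists are illustrative only, and the real lists would have to be supplied by DRDO.

```python
import re
from typing import Dict, List

# Illustrative keyword lists only (assumption); not the project's actual lists.
TOPIC_KEYWORDS: Dict[str, List[str]] = {
    "DRDO products": ["missile", "tank", "UAV", "LCA", "AWACS"],
    "Nuclear weapons and nuclear powers": ["nuclear weapon", "nuclear power"],
}


def match_topics(text: str) -> List[str]:
    """Return topics with a keyword that occurs in text as a whole word."""
    matched = []
    for topic, keywords in TOPIC_KEYWORDS.items():
        for kw in keywords:
            # \b avoids hits inside longer words; the optional "s" allows plurals.
            pattern = r"\b" + re.escape(kw) + r"s?\b"
            if re.search(pattern, text, re.IGNORECASE):
                matched.append(topic)
                break  # one matching keyword is enough for this topic
    return matched


print(match_topics("India test-fires new missile"))
```

Plain keyword matching is only a starting point: it cannot tell a battle tank from a water tank, so a real system would likely need phrase-level keywords or a manual review step.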
4. It should be possible to host the crawled news items on a web server so that users across the globe can access them online.
5. The software should display results date-wise, in reverse chronological order. By default, news for the current date will be displayed.
6. There should be a provision for advanced search through which news articles can be filtered date-wise, source-wise, topic-wise, or country-wise, or by any combination of these.
The following combinations are possible:
Given a time range and a topic, list all articles on that topic published within that time range.
Given a time range and country of publication, list all articles for all topics.
Given a list of countries, list all articles country-wise
Given a country and topic, list all articles published in that country on that topic.
Given a time range, list all topics on which new information has been published on the web.
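The combinations listed under item 6 can all be served by one filter in which every criterion is optional. The sketch below is an in-memory illustration under stated assumptions: the `Article` record, `search`, and `topics_in_range` are hypothetical names, the sample records are made up, and a real deployment would more likely run equivalent queries against a database.

```python
from dataclasses import dataclass
from datetime import date
from typing import Collection, Iterable, List, Optional


@dataclass(frozen=True)
class Article:
    """Hypothetical stored form of one crawled news item."""
    title: str
    source: str
    country: str
    topic: str
    published: date


def search(
    articles: Iterable[Article],
    start: Optional[date] = None,
    end: Optional[date] = None,
    sources: Optional[Collection[str]] = None,
    countries: Optional[Collection[str]] = None,
    topics: Optional[Collection[str]] = None,
) -> List[Article]:
    """Filter by any combination of criteria; None means no restriction."""
    hits = [
        a for a in articles
        if (start is None or a.published >= start)
        and (end is None or a.published <= end)
        and (sources is None or a.source in sources)
        and (countries is None or a.country in countries)
        and (topics is None or a.topic in topics)
    ]
    # Newest first, matching the default display order in item 5.
    return sorted(hits, key=lambda a: a.published, reverse=True)


def topics_in_range(articles: Iterable[Article], start: date, end: date) -> List[str]:
    """List the distinct topics with at least one article in the range."""
    return sorted({a.topic for a in search(articles, start=start, end=end)})


# Made-up sample records, for illustration only.
sample = [
    Article("Item A", "China Daily", "China", "Nuclear weapons", date(2024, 1, 5)),
    Article("Item B", "Daily Mail", "UK", "DRDO products", date(2024, 1, 7)),
    Article("Item C", "The Sun", "UK", "Nuclear weapons", date(2024, 2, 1)),
]

uk_nuclear = search(sample, countries={"UK"}, topics={"Nuclear weapons"})
january_topics = topics_in_range(sample, date(2024, 1, 1), date(2024, 1, 31))
```

Passing a set for `sources`, `countries`, or `topics` also covers item 7, since selecting "all" is the same as leaving that argument as `None`.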
7. Users should be given the option of selecting all or some of the newspaper websites, blogs, countries, topics, etc. on which to perform a search.
8. There should be an archive section that stores all news articles from previous years date-wise.
9. The software should be able to generate log reports, such as the source generating the most news over a period of time or the topic generating the most news over a given period.
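The reports described in item 9 amount to counting stored items per source or per topic within a date range. A minimal sketch follows, assuming each stored item is available as a dictionary with `source`, `topic`, and `published` keys; the `top_by` helper and the sample records are hypothetical.

```python
from collections import Counter
from datetime import date
from typing import Any, Dict, Iterable, List, Tuple


def top_by(
    records: Iterable[Dict[str, Any]],
    field: str,
    start: date,
    end: date,
    n: int = 1,
) -> List[Tuple[str, int]]:
    """Return the n most frequent values of field among records in the range."""
    counts = Counter(
        r[field] for r in records if start <= r["published"] <= end
    )
    return counts.most_common(n)


# Made-up sample records, for illustration only.
records = [
    {"source": "The Korea Times", "topic": "DRDO products", "published": date(2024, 3, 1)},
    {"source": "The Korea Times", "topic": "Nuclear weapons", "published": date(2024, 3, 2)},
    {"source": "Wall Street Journal", "topic": "Nuclear weapons", "published": date(2024, 3, 3)},
    {"source": "Wall Street Journal", "topic": "Nuclear weapons", "published": date(2024, 4, 10)},
]

march_top_source = top_by(records, "source", date(2024, 3, 1), date(2024, 3, 31))
march_top_topic = top_by(records, "topic", date(2024, 3, 1), date(2024, 3, 31))
```

The same function serves both report types because the field to count is a parameter; raising `n` gives a ranked list rather than a single winner.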