How to Use the Voting Portal?

Search for any Entity

Search for any entity you know about or are interested in. Start typing to see a list of entities whose names contain your current input. Choose from about 6 million entities across the English Wikipedia.
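The superstring lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `suggest_entities`, the `limit` parameter, the case-insensitive matching, and the shortest-first ordering are hypothetical and not taken from the portal's code.

```python
def suggest_entities(query, entity_names, limit=10):
    """Return up to `limit` entity names that contain `query` as a substring.

    Hypothetical helper: matching is case-insensitive and shorter names are
    listed first, so near-exact matches appear before long titles.
    """
    needle = query.strip().lower()
    if not needle:
        return []
    matches = [name for name in entity_names if needle in name.lower()]
    matches.sort(key=lambda name: (len(name), name))
    return matches[:limit]


if __name__ == "__main__":
    names = ["New York City", "York", "Yorkshire", "Paris"]
    print(suggest_entities("york", names))
```

A linear scan like this would be too slow for millions of entities; a real deployment would more likely rely on a search index, but the matching rule is the same.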

Pick a Surface Name

The landing page shows both the correct and the incorrect surface names used across Wikipedia for the selected entity. Pick an incorrect surface name to proceed.

Decide

View the mention sentence or visit the Wikipedia source page to understand the reference, and decide whether the selected surface name is indeed incorrect. If it is, consider what a correct replacement would be.

Vote

Vote, based on your observation, on whether the mentioned surface name is incorrect. If it is incorrect, also vote for a proper replacement. Replacements do not need to be case-sensitive.
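One way to read the case rule is that replacement votes differing only in letter case count as the same suggestion. The sketch below tallies votes under that assumption; `tally_replacements` is a hypothetical name and does not describe how the portal actually stores or counts votes.

```python
from collections import Counter


def tally_replacements(votes):
    """Count replacement votes, ignoring case and surrounding whitespace.

    Hypothetical helper: returns (normalized_name, count) pairs, with the
    most-voted replacement first.
    """
    counts = Counter(v.strip().casefold() for v in votes if v.strip())
    return counts.most_common()


if __name__ == "__main__":
    print(tally_replacements(["Barack Obama", "barack obama", "Obama"]))
```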

The Project

We attempted to automate the process of identifying incorrect surface names and classifying them into different kinds. We came up with a general heuristic for classifying surface names as correct or incorrect, and suggested corrections based on edit distance and a superstring-substring heuristic.

Methodology

We use frequency and edit-distance measures to classify and label the surface names. The kinds of errors found in surface names are typing errors, incorrect entities, and vandalism. After classification, we find appropriate replacements for the incorrect surface names from our suggested candidate correct surface names. To analyse the quality of the corrections, we developed a web portal where users manually check the suggested corrections for wrong surface names, and a Wikipedia bot to automate the process. The bot pushes all updates once every day.
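The classification and suggestion steps can be illustrated with a minimal sketch. It is not the project's implementation: the names `edit_distance` and `classify_and_suggest`, the thresholds `min_share` and `max_distance`, and the rule that frequent names are treated as correct are all assumptions made for illustration. Rare names are labelled "suspect" rather than "incorrect", since the final decision is left to the voters.

```python
def edit_distance(a, b):
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def classify_and_suggest(surface_counts, min_share=0.05, max_distance=2):
    """Label surface names and suggest replacements for the suspect ones.

    surface_counts maps each surface name to its number of mentions.
    Assumption: a name holding at least `min_share` of all mentions is
    treated as correct; every other name is labelled suspect.
    """
    total = sum(surface_counts.values())
    if total == 0:
        return {}
    correct = [n for n, c in surface_counts.items() if c / total >= min_share]
    # Try the most frequent correct names first when looking for a replacement.
    correct.sort(key=lambda n: -surface_counts[n])
    result = {}
    for name in surface_counts:
        if name in correct:
            result[name] = ("correct", None)
            continue
        low = name.lower()
        suggestion = None
        for cand in correct:
            c_low = cand.lower()
            # Superstring-substring check, then the edit-distance check.
            related = low in c_low or c_low in low
            if related or edit_distance(low, c_low) <= max_distance:
                suggestion = cand
                break
        result[name] = ("suspect", suggestion)
    return result


if __name__ == "__main__":
    counts = {"Barack Obama": 95, "Barak Obama": 3, "Michelle": 2}
    for name, label in classify_and_suggest(counts).items():
        print(name, label)
```

In this sketch a suspect name with no suggestion (such as a reference to a different entity) is left for voters to resolve, which matches the portal's role of manually checking the proposed corrections.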

Current Structure

Contact Us

For the dataset and codebase, contact us at: awekar@iitg.ac.in