New algorithms help assess and organize scientific literature automatically
Publication date: 27-04-2013

Staying at the cutting edge of modern science and keeping up with all of its latest achievements is an extremely difficult task today, even within a single specialized field. Thousands of scientific articles, research results, and papers are published every day, and no one person, however clever, can read them all, let alone absorb the information they contain.

To address this flood of information, researchers from North Carolina State University have developed a computer program that can automatically evaluate, organize, and sort scientific literature and publications, pointing users only to the most relevant and reliable sources of information.

Using in-depth text analysis algorithms, the program assigns a priority for further review to every research paper that comes into its field of view. Selected papers are placed in a specialized thematic database such as the Comparative Toxicogenomics Database (CTD), an open-access database that contains information on how various chemical agents affect the human genome, and through it the health of present and future generations.

"On the single topic of the health effects of toxic heavy metals, more than 33,000 scientific papers have been published since 1926," explains Dr. Allan Peter Davis, one of the leaders of the CTD project. "Even with maximum effort, we would not be able to read all of these papers and pick out only the most important information. Fortunately, our new algorithms can now do this successfully."

As mentioned above, the most important publications are selected by in-depth text analysis algorithms. The program compares the texts of many thousands of articles and identifies matches and facts. It expresses the result as a numerical measure of each document's scientific "weight", which determines whether the document is included in the shared database. "The algorithm is not meant for a single article. It works effectively with large sets of articles, and in that setting it very reliably separates the wheat from the chaff, so to speak," says Thomas Wiegers, one of the bioinformatics researchers.
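The scoring step described above can be illustrated with a minimal Python sketch. It is not the CTD team's actual algorithm, which the article does not specify. It assumes a simple term-frequency weighting computed across the whole set of documents, and the names `score_documents`, `select_for_curation`, `key_terms`, and `threshold` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score_documents(documents, key_terms):
    """Return one numerical "weight" per document.

    Assumption for illustration only: each key term contributes its count
    in the document, weighted by an inverse document frequency computed
    over the whole set. The score is therefore relative to the corpus,
    which is why it is only meaningful for sets of articles.
    """
    token_counts = [Counter(tokenize(doc)) for doc in documents]
    n_docs = len(documents)
    scores = []
    for counts in token_counts:
        score = 0.0
        for term in key_terms:
            doc_freq = sum(1 for c in token_counts if term in c)
            idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
            score += counts[term] * idf
        scores.append(score)
    return scores


def select_for_curation(documents, key_terms, threshold):
    """Return indices of documents scoring at or above threshold, best first."""
    scores = score_documents(documents, key_terms)
    ranked = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
    return [i for i in ranked if scores[i] >= threshold]


if __name__ == "__main__":
    # Made-up abstracts, not data from the study.
    abstracts = [
        "Arsenic exposure alters gene expression in human cells.",
        "A history of the university library.",
        "Cadmium and arsenic affect gene regulation; arsenic toxicity is dose dependent.",
    ]
    terms = ["arsenic", "cadmium", "gene"]
    print(select_for_curation(abstracts, terms, threshold=1.0))
```

In this toy run the third abstract ranks first, the first abstract second, and the unrelated library abstract is excluded. A real system would use far richer features than a fixed list of terms.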

To test the algorithms, the researchers selected 15,000 articles and sent them to a team of qualified reviewers, who read them and then had to choose the most important documents. "The results were impressive," says Dr. Davis. "The reviewers' choices matched the computer's selection 85% of the time. The only difference is that the computer made its choice much faster than people."
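The article does not say how the 85% match was calculated. As one plausible reading, the sketch below treats it as the share of articles on which the reviewers and the computer made the same include-or-exclude decision. The function name `agreement_rate` and the sample data are hypothetical.

```python
def agreement_rate(reviewer_picks, computer_picks, all_articles):
    """Fraction of articles on which reviewer and computer made the same call.

    Assumption for illustration: an article counts as a match when both
    sides selected it or both sides left it out.
    """
    all_articles = set(all_articles)
    if not all_articles:
        raise ValueError("all_articles must not be empty")
    reviewer_picks = set(reviewer_picks) & all_articles
    computer_picks = set(computer_picks) & all_articles
    # Symmetric difference: articles picked by exactly one side.
    disagreements = reviewer_picks ^ computer_picks
    return 1 - len(disagreements) / len(all_articles)


if __name__ == "__main__":
    # Made-up article IDs, not data from the study.
    articles = range(1, 21)                 # 20 articles in total
    reviewers = {1, 2, 3, 4, 5, 6}          # selected by human reviewers
    computer = {1, 2, 3, 4, 7, 8, 9}        # selected by the program
    print(agreement_rate(reviewers, computer, articles))  # 0.75
```

Here the two sides disagree on 5 of 20 articles, so the agreement rate is 0.75. Other definitions, such as overlap among the selected articles only, would give different figures.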

Using algorithms to evaluate scientific articles will allow scientists to save time and raise their work efficiency by at least 30 percent. "This technology will save a huge amount of precious time," explains Dr. Davis. "Thanks to our technology, we can use the resources of entire research teams more effectively, letting scientists work only with documents that contain the maximum possible amount of relevant information."

Naturally, as with any algorithm, the algorithms that evaluate scientific literature produce anomalies, in which a high score is assigned to an article that a human reviewer dismisses as unimportant. The researchers thoroughly analyzed the texts of these "anomalous" articles and identified the reasons why the program made the wrong decision. "Now we can make corrections to our algorithms, after which the system will work as accurately as possible."

"We are still far from the stage where a computer can read the literature on its own, extract all the important data, and present it in readable form," says Davis. "But the in-depth text analysis we have implemented is a big step forward in this direction."


Tags: innovation, science