Bibliographic Computer Science Indexing Review with Disease Covid 19

- Researchers search publication homepages according to their expertise, research collaborations, and research interests. During the Covid 19 pandemic, the disease has become a trending topic for researchers in various scientific fields. This study classifies publications indexed by two sources, Scopus and Google Scholar, by analyzing the topics Natural Language Processing, Text Mining, Remote Sensing, and Sentiment Analysis. Named Entity Recognition is used to detect and classify named entities in text, and the topics are weighted using the occurrence and link strength methods. The results show that, in the science index literature on Covid 19, Scopus has the most evenly distributed percentages, with good occurrence and link strength among the five scientific fields: Natural Language Processing 23.81%, Text Mining 19.05%, Remote Sensing 0%, and Sentiment Analysis 57.14%. For Google Scholar the figures are Natural Language Processing 51.35%, Text Mining 0%, Remote Sensing 48.65%, and Sentiment Analysis 0%.


I. INTRODUCTION
Coronavirus disease (COVID 19) was first discovered in Wuhan, China, at the end of 2019 [1]. The virus is highly contagious and spread rapidly through respiratory droplets and close contact, reaching not only various parts of China but also Japan, Thailand, and South Korea in less than one month [2]. It is a significant threat to global health and to millions of lives worldwide [3], causing acute respiratory system disorders, and it was officially declared a global pandemic by the World Health Organization (WHO) on March 11, 2020 [4]. The pandemic has caught the world's attention because its uncontrolled spread caused a spike in cases [5]. As of April 30, 2020, there were a reported 3.2 million confirmed cases and 227,847 deaths in 185 countries [6]. Because the spread of the virus is rapid and uncontrolled, a large number of studies have been carried out and published [7] free of charge to speed up research and assist governments in responding to the crisis [8]. It is therefore important to evaluate this literature quantitatively and qualitatively in order to obtain literature patterns, identify gaps, and use the results.
Information Extraction is the process of extracting entities, relations, objects, events, and many other types of information from unstructured data in order to improve data analysis. The perspective of streaming data is very different from that of static data: static data has no connection between the time of initial processing and the time of subsequent processing. Researchers' academic publications contain rich information, which enables many applications such as academic search, bibliographic analysis, and citation analysis.
This research was conducted using bibliometric analysis of scientific publications, which is a useful tool for understanding how knowledge is generated and developed, as well as for evaluating the quality of a field of science and its impact in the academic area [9]. In addition, bibliometric analysis can be used to map research that is in progress, research that has already been done, and future opportunities [10].

ISSN 2355-0082
The purpose of this research is to map research across several areas of technology by examining the topics Natural Language Processing, Text Mining, Remote Sensing, and Sentiment Analysis in work published during the pandemic. The research mapping process consists of the following stages: selecting the objects, calculating the co-occurrence of the objects, normalizing, creating the map, displaying the map, and evaluating the map [11]. VOSviewer, downloaded from www.vosviewer.com, is used to display the bibliometric map visualizations. VOSviewer visualizes bibliometric maps based on author or journal names with co-citation data, or based on keywords with co-occurrence data, and offers label, density, and cluster views [12]. Clusters in VOSviewer maps are distinguished by color. The clustering algorithm is controlled by a parameter that can be changed so that more or fewer clusters are generated [13].
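As an illustration of the normalization stage, one normalization commonly used for co-occurrence maps is the association strength. The formula below is a sketch; whether this exact normalization was the setting used in this study is an assumption. Here $c_{ij}$ denotes the number of co-occurrences of objects $i$ and $j$, and $w_i$ and $w_j$ denote their total numbers of occurrences.

```latex
s_{ij} = \frac{c_{ij}}{w_i \, w_j}
```

Under this normalization, two frequent terms need many joint occurrences to obtain a high similarity, so that very common terms such as "covid" do not dominate the map simply because they appear in most documents.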

A. Data Sources and Methods
Bibliometrics is the use of statistical methods to analyze books, articles, and other publications. Bibliometric methods are frequently used in the field of library and information science. The sub-field of bibliometrics that deals with the analysis of scientific publications is called scientometrics, which is in turn a sub-field of informetrics. Major research issues include measuring the impact of research papers and academic journals, understanding scientific citations, and using such measurements in policy and management settings.
The bibliometric data were collected from Scopus and Google Scholar. In total, 2991 papers indexed with the keywords "Coronavirus with …[topic]" or "Covid 19 with…[topic]" were analyzed. Because earlier viruses of a similar type exist, data retrieval was restricted to the pandemic period, and the search categories used were topics, titles, and abstracts. Research trends were analyzed with the VOSviewer software using two weighting methods: occurrence, to see how much research exists on a topic, and link strength, to show the connectedness between research topics. Both methods analyze the data based on abstract and author.
The use of named entity recognition on publisher homepages has problems and complexities that are generally the same as those for English text, especially when a machine learning approach is used. Fundamental differences arise when rule-based methods, or hybrid models combining rule-based methods and machine learning, are used. The approach here uses unsupervised learning, so it does not require labeled data for the learning process.
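The two weights can be illustrated with a minimal sketch. It assumes that each paper has already been reduced to a list of terms and that a term is counted once per paper; the function name and the sample data are hypothetical, and this is not a reproduction of VOSviewer's internals.

```python
from collections import Counter
from itertools import combinations


def occurrence_and_link_strength(docs):
    """Return (occurrences, link weights, total link strength) for term lists."""
    occurrences = Counter()
    links = Counter()
    for terms in docs:
        # Count each term once per document (an assumed counting convention).
        unique = sorted(set(terms))
        occurrences.update(unique)
        # Every unordered pair of terms in the same document is one link.
        links.update(combinations(unique, 2))
    # Total link strength of a term = sum of the weights of its links.
    total_link_strength = Counter()
    for (a, b), weight in links.items():
        total_link_strength[a] += weight
        total_link_strength[b] += weight
    return occurrences, links, total_link_strength


# Hypothetical sample: three papers, each reduced to a list of terms.
docs = [
    ["covid", "pandemic", "sentiment analysis"],
    ["covid", "tweet", "sentiment analysis"],
    ["covid", "pandemic"],
]
occ, links, tls = occurrence_and_link_strength(docs)
```

In this sample, "covid" occurs in all three papers and accumulates the highest total link strength because it co-occurs with every other term.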
The stages of the proposed method are as follows:
Data Preparation for Sequential Pattern Mining: Sentences that contain named entities are prepared so that patterns can be generated at each appearance of an entity. To limit the number of patterns produced, pattern extraction is restricted to 5 words before and after the appearance of the entity.
Sequential Pattern Mining: An algorithm is applied to the existing learning data to produce the desired patterns.
Pattern Matching and Candidate Extraction: Test datasets are prepared and matched against the resulting patterns. The results are sorted according to their level of confidence and support.
Candidate Pruning: This process is carried out to improve the accuracy of the named entities produced.
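The stages above can be sketched as follows. This is a simplified illustration, not the algorithm used in the study: it replaces a full sequential pattern mining algorithm with contiguous word sequences that directly precede a seed entity, ignores the right-hand context, and uses hypothetical function names, thresholds, and sample sentences.

```python
from collections import Counter

WINDOW = 5  # the text limits patterns to 5 words around an entity


def mine_patterns(sentences, entities, window=WINDOW):
    """Map each left-context pattern to (support, confidence).

    support    = times the pattern directly precedes a seed entity
    confidence = support / times the pattern precedes any token
    """
    support, totals = Counter(), Counter()
    for tokens in sentences:
        for i in range(1, len(tokens)):
            for n in range(1, min(window, i) + 1):
                pattern = tuple(tokens[i - n:i])
                totals[pattern] += 1
                if tokens[i] in entities:
                    support[pattern] += 1
    return {p: (s, s / totals[p]) for p, s in support.items()}


def extract_candidates(sentences, patterns, min_conf=0.6, min_support=2,
                       window=WINDOW):
    """Match patterns on new text, prune weak ones, sort by (confidence, support)."""
    best = {}
    for tokens in sentences:
        for i in range(1, len(tokens)):
            for n in range(1, min(window, i) + 1):
                score = patterns.get(tuple(tokens[i - n:i]))
                if score is None:
                    continue
                sup, conf = score
                # Candidate pruning: drop matches from unreliable patterns.
                if sup >= min_support and conf >= min_conf:
                    cand = tokens[i]
                    if cand not in best or (conf, sup) > best[cand]:
                        best[cand] = (conf, sup)
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical training sentences and seed entities.
train = [
    "patients infected with covid showed symptoms".split(),
    "people infected with sars recovered".split(),
    "data about covid increased".split(),
    "models trained with data".split(),
]
patterns = mine_patterns(train, {"covid", "sars"})
candidates = extract_candidates(
    ["doctors infected with mers isolated".split()], patterns)
```

In this sample, the pattern "infected with" precedes a seed entity both times it occurs, so the unseen token "mers" is proposed as an entity candidate, while the less reliable single-word pattern "with" contributes only a lower confidence.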

B. Citation Mapping Result
Mapping is a process that allows one to identify knowledge elements and their configurations, dynamics, interdependencies, and interactions. Knowledge mapping is used for technology management purposes, which include the definition of research programs, decisions regarding technology activities, the design of knowledge base structures, and the creation of education and training programs. Citation mapping is a graphical representation that shows the citation relationships (cited references and citing articles) between a paper and other papers using various visualization tools and techniques. The citation mapping tool from Web of Knowledge tracks an article's cited and cited-by references through two generations.

In a paper published earlier this year, Malcolm Tight examines the theoretical considerations around commonalities in the approaches of communities of practice and Becher's academic tribes and territories. He conducts a co-citation analysis of Higher Education research journals, focusing on author characteristics and locations, themes, theories and analyses, and methodologies and methods, and presents a simple diagrammatic representation of his descriptive model. Similar ideas of "citation mapping" have been explored elsewhere, particularly in the natural sciences, and a form of it has recently been introduced in the citation and journal database ISI Web of Science. In addition, interaction designer W. Bradford Paley's visualization of 800,000 scientific papers uses author citations to explore the interconnections between scientific paradigms.
Related to bibliometrics, science mapping is a method of visualizing a field of science. This visualization is done by creating a landscape map that can display topics from science [14]. Data visualization is a vital part of data science, and it is used in two main parts of the data science cycle: at the beginning, during data exploration, and at the end, during the presentation of results. Even though the visualization techniques are the same, these two stages have different goals. Data exploration starts from ignorance and tries to gain knowledge by finding hidden facts, patterns, or outliers. Result presentation starts from knowledge and tries to communicate the message in the clearest and most effective way possible. Hence, although they share the same techniques, their goals and starting points are different.
We performed metadata analysis on the text records downloaded from Scopus and Google Scholar in order to extract the data. This included extraction of the title, author, year, and computer science topic.
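A minimal sketch of such a metadata extraction step is shown below. It assumes that the downloaded records are available as CSV text with the columns Title, Authors, and Year, with authors separated by semicolons, and it tags the topic by simple keyword matching on the title. The column names, the function name, and the sample records are assumptions made for illustration.

```python
import csv
import io

# The four topics analyzed in this study.
TOPICS = (
    "natural language processing",
    "text mining",
    "remote sensing",
    "sentiment analysis",
)


def extract_metadata(csv_text):
    """Extract title, authors, year, and matched topics from CSV records."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        title = row["Title"].strip()
        lowered = title.lower()
        records.append({
            "title": title,
            # Assumed convention: authors are separated by semicolons.
            "authors": [a.strip() for a in row["Authors"].split(";") if a.strip()],
            "year": int(row["Year"]),
            # Tag the record with every topic named in its title.
            "topics": [t for t in TOPICS if t in lowered],
        })
    return records


# Hypothetical sample records.
sample = '''Title,Authors,Year
"Sentiment analysis of Covid 19 tweets","Doe, J.; Roe, R.",2020
"Text mining of coronavirus abstracts","Poe, A.",2021
'''
records = extract_metadata(sample)
```

Matching on the title alone is a deliberate simplification; the same function could be pointed at an abstract or keyword column if the export provides one.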
The morphological analysis allowed the data to be tagged for potential use. Several issues with Covid 19 were encountered while tagging the data, described as follows:
Author field: The names of computer science topics related to Covid 19 usually consist of two or three parts, and it is not always clear which part indicates the Covid 19 disease. A definition was created for Covid 19 disease names, but the separating punctuation mark (,) was not given: review, role, outbreak, diagnosis, approach, detection, chest x-ray, and pneumonia.
Types of documents: In the database, we found the following types of Computer Science documents related to Covid 19:

 Google Scholar
For each type of document, we identified the mandatory fields and the other field values occurring in the database. The information extraction algorithm and the retrieval logic were based on these field values. A sample of the mandatory fields for each document is tabulated in Table 1. The article data in Table 1 were then imported one by one into VOSviewer in txt format and compiled. The titles from the five fields of science were filtered by topics related to Covid 19 and then weighted using occurrence and total link strength, as presented in the table below:
The VOSviewer analysis of the NLP field produced 4 clusters, distinguished by color in Figure 2, which are related to natural language, artificial intelligence, and machine learning.
From this mapping, it can be concluded that there is no research link between NLP and Text Mining, Remote Sensing, or Sentiment Analysis.

Text Mining for Covid 19
The field of Text Mining has different research links from the NLP field, with 3 clusters in red, green, and blue whose topics are closely related to text, system, and classification. It can therefore be concluded that there are still text mining research opportunities involving drug, risk, and country, as well as the 3 other fields, as visualized in Figure 4 below:
VOSviewer groups the research mapping analysis into 5 clusters in red, blue, green, yellow, and purple, and shows a strong correlation with information in the purple cluster, including the topics accuracy, change, factor, and erratum. In the field of sentiment analysis, the map displays very strong relationships among topics including neural networks, sentiment classification, models, papers, text, and algorithms.

B. Scopus
The second analysis was carried out on the article data from Scopus, which are as follows:

The data in the table above are visualized in Figure 6, which shows the network visualization, and Figure 7, which shows the density visualization. In the figure, the topic covid has a strong network with pandemic and is connected to Twitter data, sentiment analysis, outbreak, and impact; there is also a correlation with the text mining approach and with natural language processing.
Figure 9 shows that Text Mining research has the strongest link with the topics covid and tweet. There are only 2 clusters: the red cluster with the topics covid and tweet, and the blue cluster with the topics impact and person. This topic still has enormous research opportunities, especially in relation to Covid 19.
This topic has links to other research, with the strongest link strength on Twitter data, tweets, covid, and its applications. There are 4 clusters: the red cluster with the topics covid, tweet, Twitter data, and era; the green cluster with the topics outbreak, India, and lockdown; the blue cluster with the topics impact and application; and the purple cluster with the single topic pandemic.
IV. CONCLUSION