Vom 20.12.2025 bis 11.01.2026 ist die Universitätsbibliothek geschlossen. Ab dem 12.01.2026 gelten wieder die regulären Öffnungszeiten. Ausnahme: Medizinische Hauptbibliothek und Zentralbibliothek sind bereits ab 05.01.2026 wieder geöffnet. Weitere Informationen

Treffer: Preprocessing framework for scholarly big data management.

Title:
Preprocessing framework for scholarly big data management.
Source:
Multimedia Tools & Applications; Oct2023, Vol. 82 Issue 25, p39719-39743, 25p
Database:
Complementary Index

Weitere Informationen

Big data technologies have found applications in disparate domains. One of the largest sources of textual big data is scientific documents and papers. Scholarly big data has been used in numerous ways to develop innovative applications such as collaborator discovery, expert finding and research management systems. With the evolution of machine and deep learning techniques, the efficacy of such applications has risen manifold. However, the biggest challenge in the development of deep learning models for scholarly applications in cloud-based environment is the under-utilization of resources because of the excessive time required for textual preprocessing. This paper presents a preprocessing pipeline that uses Spark for data ingestion and Spark ML for performing preprocessing tasks. The proposed approach is evaluated with the help of a case study, which uses LSTM-based text summarization to generate title or summaries from abstracts of scholarly articles. Results indicate a substantial reduction in ingestion, preprocessing and cumulative time for the proposed approach, which shall manifest reduction in development time and costs as well. [ABSTRACT FROM AUTHOR]

Copyright of Multimedia Tools & Applications is the property of Springer Nature and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)