No | Name | Email | Research Title | Abstract | Publication Link
---|---|---|---|---|---
1 | Veren Prisscilya | veren.prisscilya@binus.ac.id | Classification of Indonesia False News Detection Using BERTopic and IndoBERT | In the current global era, technology and information are developing rapidly, making it very easy to obtain information and news from the internet. Because of this ease of access, a great deal of false news (hoaxes) circulates unfiltered, and anyone can spread news of dubious content. This can undermine a person's professional credibility, cause division, threaten physical and mental health, and lead to material losses. Accordingly, the way to stop the spread of hoaxes is to detect them as early as possible and block them. Detection can use a deep learning method built on the transformer architecture: BERTopic is used to extract the important words from a news narrative, those words are appended to the narrative, and the result is classified using Indo Bidirectional Encoder Representations from Transformers (IndoBERT). For the experiments, the authors used the Indonesia False News (HOAX) dataset from kaggle.com. Using a learning rate of 1e-5, a batch size of 16, and 5 epochs, the model achieved an F1-score of 92% on the validation data and 91% on the test data. | https://jist.publikasiindonesia.id/index.php/jist/article/view/1310 |
2 | Andien Dwi Novika | andien.novika@binus.ac.id | Multi-layer perceptron hyperparameter optimization using Jaya algorithm for disease classification | This study introduces an innovative hyperparameter optimization approach for enhancing multilayer perceptrons (MLP) using the Jaya algorithm. Addressing the crucial role of hyperparameter tuning in MLP performance, the Jaya algorithm, inspired by social behavior, emerges as a promising optimization technique with no algorithm-specific parameters. Systematic application of Jaya dynamically adjusts hyperparameter values, leading to notable improvements in convergence speed and model generalization. Quantitatively, the Jaya algorithm consistently converges by the first iteration, faster than conventional methods, and achieves 7% higher accuracy on several datasets. This research contributes to hyperparameter optimization, offering a practical and effective solution for optimizing MLPs in diverse applications, with implications for improved computational efficiency and model performance. | https://www.researchgate.net/profile/Andien-Novika-2/publication/381868452_Multi-layer_perceptron_hyperparameter_optimization_using_Jaya_algorithm_for_disease_classification/links/668ce4f0c1cf0d77ffc3b1ab/Multi-layer-perceptron-hyperparameter-optimization-using-Jaya-algorithm-for-disease-classification.pdf |
3 | Julyanto Wijaya | - | Indonesian News Extractive Summarization using Lexrank and YAKE Algorithm | The surge in global technological advancements has led to an unprecedented volume of information sharing across diverse platforms. This information, easily accessible through browsers, has created an overload, making it challenging for individuals to efficiently extract essential content. In response, this paper proposes a hybrid Automatic Text Summarization (ATS) method combining the LexRank and YAKE algorithms. LexRank determines sentence scores, while YAKE calculates individual word scores, collectively enhancing summarization accuracy. Leveraging an unsupervised learning approach, the hybrid model demonstrates a 2% improvement over its base model. To validate the effectiveness of the proposed method, the paper utilizes 5000 Indonesian news articles from the Indosum dataset. Ground-truth summaries are employed, with the objective of condensing each article to 30% of its content. The algorithmic approach and experimental results are presented, offering a promising solution to information overload. Notably, the results reveal a two percent improvement in the Rouge-1 and Rouge-2 scores, along with a one percent enhancement in the Rouge-L score. These findings underscore the potential of incorporating a keyword score to enhance the overall accuracy of the summaries generated by LexRank. Despite the absence of a machine learning model in this experiment, the unsupervised learning and heuristic approach suggest broader applications on a global scale. A comparative analysis with other state-of-the-art text summarization methods or hybrid approaches will be essential to gauge its overall effectiveness. | http://www.iapress.org/index.php/soic/article/view/1976/1102 |
4 | Yusuf Priyo Anggodo | - | A Novel Modified Binning and Logistic Regression to Handle Shifting in Credit Scoring | The development of financial technology (Fintech) in emerging economies such as Indonesia has been rapid in the last few years, opening great potential for loan businesses, from venture capital to micro and personal loans. To survive in such competitive markets, new companies need a robust credit-scoring model. However, building a reliable model requires large, stable data, and the challenge is that datasets are often small, covering only a few months (short-period datasets). Therefore, this study proposes a modified binning method, namely changing a variable's values into two groups with the smallest distribution differences possible. Modified binning can maintain data trends to avoid future shifting. The simulation was conducted using a real dataset from an Indonesian Fintech, comprising 44,917 borrower-level observations with 396 variables. To match actual conditions, the first three months of data were allocated for modeling and the remainder for testing. Applying modified binning and logistic regression to the testing data results in a more stable score band than standard binning. Compared with other classifier methods, the proposed method obtained the best AUC on the testing data (0.73). In addition, the proposed method is highly applicable, as it can provide a straightforward explanation to upper management or regulators, and it is practical to use in real-world financial technology settings with short-period problems. | https://link.springer.com/article/10.1007/s10614-023-10410-6 |
5 | Bambang Nursandi | bambang.nursandi@binus.ac.id | Waste Pollution Classification in Indonesian Language using DistilBERT | In Indonesia, waste pollution poses pressing environmental and health challenges, making accurate classification essential for targeted mitigation efforts. Our research aims to extract relevant data from Twitter to address this problem and to assess how effectively the DistilBERT model can classify Indonesian-language text related to waste pollution. DistilBERT, a leaner counterpart of the well-established BERT architecture, is designed to mirror BERT's sophisticated linguistic understanding at lower computational cost. Leveraging the essence of transfer learning, the proposed method using DistilBERT benefits from extensive textual corpora, making it well suited to scenarios with limited data availability. We adopt DistilBERT for the specific challenge of classifying waste types using a limited dataset drawn from Indonesian-language Twitter conversations, a medium known for its concise and often ambiguous content. Despite the limited dataset and the noise inherent in Twitter data, the results using DistilBERT show striking efficacy, achieving 98% precision, 98% recall, and a 98% F1-score. These results underscore DistilBERT's ability to navigate and understand complex textual nuances in data-constrained environments. Our research also includes a comparative analysis with other methods, further highlighting the importance of transfer learning in addressing natural language processing challenges, particularly in critical contexts such as waste management efforts in Indonesia. | https://gemawiralodra.unwir.ac.id/index.php/gemawiralodra/article/view/645 |
6 | Bima Krisna Noveta | bambang.nursandi@binus.ac.id | Six classes named entity recognition for mapping location of Indonesia natural disasters from twitter data | Purpose: The purpose of this study is to locate natural disasters on maps by extracting Twitter data. Twitter text is processed with named entity recognition (NER) using a six-class location hierarchy for Indonesia, and each tweet is then classified into eight classes of natural disasters using a support vector machine (SVM). Overall, the system is able to classify tweets and map the locations mentioned in their content. Design/methodology/approach: This research builds a model to map the geolocation of tweet data using NER with six classes based on Indonesia's regions. The data are then classified into eight classes of natural disasters using the SVM. Findings: Experiment results demonstrate that the proposed NER, with six special classes based on Indonesia's regional levels, is able to map disaster locations from Twitter data. The results also show good geocoding performance in terms of match rate, match score, and match type. Moreover, with the SVM, this study can also classify the collected tweets into eight types of natural disasters specific to the Indonesian region. Research limitations/implications: This study is limited to the Indonesian region. Originality/value: (a) NER with six classes is used to build a location classification model with StanfordNER and ArcGIS tools. The six location classes reflect Indonesia's large, multi-level regional structure: province, district/city, sub-district, village, road, and place names. (b) The SVM is used to classify natural disasters into eight types: floods, earthquakes, landslides, tsunamis, hurricanes, forest fires, droughts, and volcanic eruptions. | https://www.emerald.com/insight/content/doi/10.1108/IJICC-09-2023-0251/full/html |
7 | Natasha Alyaa Anindyaputri | natasha.anindyaputri@binus.ac.id | A Comparative Study of Deep Learning Models for Detecting Depressive Disorder in Tweets | Depressive disorder is a psychiatric problem that contributes heavily to disability worldwide, as it can prevent those who experience it from functioning in daily life and can affect their working lives. To help patients who need immediate treatment, the World Health Organization (WHO) created the Mental Health Gap Action Programme (mhGAP) to scale up development focused on monitoring mental health in a globalized world. This study explores several classification model architectures with a deep learning approach to detect Twitter posts classifiable as depressive-disorder tweets and then compares the results across the proposed models. As a result, a Long Short-Term Memory (LSTM) model paired with a Transformer word-representation model delivered better performance than either a baseline Bidirectional Encoder Representations from Transformers (BERT) classifier or an LSTM classifier alone. Evaluation of the classification architectures reveals that the LSTM model, when paired with MentalBERT, achieves the highest accuracy, reaching a score of 0.86. This approach exploits the respective strengths of the transformer and LSTM models to facilitate accurate analysis of the syntactic and contextual information associated with individual words, enabling a precise, domain-specific representation of sequential data. | https://www.aasmr.org/jsms/Vol14/No.3/Vol.14.No.3.18.pdf |
8 | Anindra Ageng Jihado | anindra.jihado@binus.ac.id | Hybrid Deep Learning Network Intrusion Detection System Based on Convolutional Neural Network and Bidirectional Long Short-Term Memory | Network security has become crucial in an era where information and data are valuable assets. An effective Network Intrusion Detection System (NIDS) is required to protect sensitive data and information from cyberattacks. Numerous studies have created NIDS using machine learning algorithms and network datasets that do not accurately reflect actual network data flows. Increasing hardware capabilities and the ability to process big data have made deep learning the preferred method for developing NIDS. This study develops a NIDS model using two deep learning algorithms: Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (BiLSTM). CNN extracts spatial features in the proposed model, while BiLSTM extracts temporal features. Two publicly available benchmark datasets, CICIDS2017 and UNSW-NB15, are used to evaluate the model. The proposed model surpasses the previous method in terms of accuracy, achieving 99.83% and 99.81% for binary and multiclass classification on the CICIDS2017 dataset. On the UNSW-NB15 dataset, the model achieves accuracies of 94.22% and 82.91% for binary and multiclass classification, respectively. Moreover, Principal Component Analysis (PCA) is also used for feature engineering to improve the speed of model training and reduce existing features to ten dimensions without significantly impacting the model's performance. | https://www.jait.us/uploadfile/2024/JAIT-V15N2-219.pdf |
9 | Rahmi Fadillah Busyra | rahmi.busyra@binus.ac.id | Applying Long Short-Term Memory Algorithm for Spam Detection on Ministry Websites | Spam refers to unsolicited messages containing harmful content such as malware, viruses, phishing, and data theft. Web forms on government ministry websites are frequent targets of spammers, causing disruption, database overload, hindered communication with the public, and security risks. Although many studies have focused on spam detection, none has addressed spam detection in web-form submissions or multilingual spam detection, specifically in English and Indonesian. This study develops a spam detection model to address the growing challenge of spam messages received through ministry website web forms. The proposed model uses the Long Short-Term Memory (LSTM) algorithm to detect spam in English and Indonesian effectively. The LSTM model incorporates additional stages to improve its performance, including language detection, data augmentation, and word embedding. Evaluation results show the model's effectiveness in classifying spam and non-spam messages, particularly on datasets with a balanced class distribution. This research has practical implications for deploying the model on websites, particularly government ministry websites, to categorize incoming messages effectively and mitigate the impact of spam. The study also contributes theoretically by demonstrating the effectiveness of LSTM for spam detection and emphasizing the importance of data augmentation in handling imbalanced datasets. Overall, the study provides valuable insights and practical solutions for spam detection in web forms, applicable to government ministry websites, and broadens the scope of spam detection across languages, specifically English and Indonesian. | https://www.jait.us/uploadfile/2024/JAIT-V15N2-219.pdf |
10 | Dian Anggraini | dian.anggraini@binus.ac.id | Webshell Detection Based on Bytecode Feature with Convolutional Neural Network | A web shell is a malicious program used to remotely access web servers during cyberattacks. Malicious web shells closely resemble benign ones, making them difficult to distinguish. The challenge in detecting pre-existing web shells is that this type of malware is hard to detect with an intrusion detection system (IDS) or antivirus techniques, because web shells are usually hidden within web applications and are difficult to differentiate from regular application source code. Detection models that analyze the dynamic features of web shell script execution are therefore more effective at detecting such attacks. This study proposes a web shell detection method based on dynamic bytecode features using a convolutional neural network (CNN). Word2vec is employed to obtain vectorized features from the bytecode (opcode). Experimental results using a training dataset of 2,577 samples and a validation dataset of 645 samples yield the best model with an accuracy of 99.86% at epoch 100. The experiments demonstrate that this model effectively detects web shells, with a significant increase in accuracy. | https://www.jatit.org/volumes/Vol101No18/26Vol101No18.pdf |
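The minimal sketches below illustrate, one per entry above, how each described method might look in code. Every model checkpoint, label set, parameter, and data sample that does not appear in an abstract is an assumption for illustration only. For entry 1, a sketch of the BERTopic + IndoBERT pipeline: mine salient topic words, append them to each narrative, and classify the result (the `indobenchmark/indobert-base-p1` checkpoint is an assumed stand-in; the abstract's fine-tuning settings were learning rate 1e-5, batch size 16, 5 epochs).

```python
# Sketch of entry 1's pipeline; checkpoint and label order are assumptions.
from bertopic import BERTopic
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

NAME = "indobenchmark/indobert-base-p1"  # assumed public IndoBERT checkpoint

def classify_with_topic_words(docs):
    """docs: Indonesian news narratives (hundreds or more, since BERTopic
    needs a sizeable corpus). Returns predicted labels (0/1)."""
    # 1) Mine salient topic words from the narratives with BERTopic.
    topic_model = BERTopic(language="multilingual")
    topics, _ = topic_model.fit_transform(docs)

    # 2) Append each document's top-5 topic words to its narrative.
    augmented = []
    for doc, topic_id in zip(docs, topics):
        words = [w for w, _ in (topic_model.get_topic(topic_id) or [])][:5]
        augmented.append(doc + " " + " ".join(words))

    # 3) Classify the augmented narratives with IndoBERT (hoax vs. valid).
    tokenizer = AutoTokenizer.from_pretrained(NAME)
    model = AutoModelForSequenceClassification.from_pretrained(NAME, num_labels=2)
    batch = tokenizer(augmented, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        return model(**batch).logits.argmax(dim=-1)  # label order is an assumption
```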
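For entry 2, a sketch of the canonical Jaya update rule (Rao, 2016) on a toy two-dimensional hyperparameter search. The objective is a synthetic stand-in for the paper's MLP validation loss, and the population size and iteration count are assumptions.

```python
# Jaya: move toward the best candidate and away from the worst, with no
# algorithm-specific tuning parameters. Toy objective, not the paper's MLP.
import numpy as np

rng = np.random.default_rng(0)

def objective(x):  # pretend validation loss; minimum near (0.001, 64)
    lr, hidden = x
    return (np.log10(lr) + 3) ** 2 + ((hidden - 64) / 64) ** 2

lo, hi = np.array([1e-5, 8.0]), np.array([1e-1, 256.0])
pop = lo + rng.random((10, 2)) * (hi - lo)            # candidate solutions

for _ in range(50):
    fitness = np.apply_along_axis(objective, 1, pop)
    best, worst = pop[fitness.argmin()], pop[fitness.argmax()]
    r1, r2 = rng.random(pop.shape), rng.random(pop.shape)
    cand = pop + r1 * (best - np.abs(pop)) - r2 * (worst - np.abs(pop))
    cand = np.clip(cand, lo, hi)
    improved = np.apply_along_axis(objective, 1, cand) < fitness
    pop[improved] = cand[improved]                    # greedy acceptance

print("best hyperparameters:", pop[np.apply_along_axis(objective, 1, pop).argmin()])
```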
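For entry 3, a sketch of the hybrid scoring idea: LexRank-style sentence scores from power iteration over a cosine-similarity graph, blended with YAKE word scores. The damping factor, similarity threshold, and blending weight `alpha` are assumptions, not the paper's values.

```python
# Hybrid LexRank + YAKE sentence scoring; thresholds/weights are assumptions.
import numpy as np
import yake
from sklearn.feature_extraction.text import TfidfVectorizer

def summarize(sentences, ratio=0.3, alpha=0.5):
    # LexRank part: power iteration over a cosine-similarity graph.
    tfidf = TfidfVectorizer().fit_transform(sentences)
    sim = (tfidf @ tfidf.T).toarray()
    sim[sim < 0.1] = 0.0                            # sparsify (assumed threshold)
    row_sums = sim.sum(axis=1, keepdims=True)
    transition = sim / np.where(row_sums == 0, 1, row_sums)
    scores = np.ones(len(sentences)) / len(sentences)
    for _ in range(30):
        scores = 0.15 / len(sentences) + 0.85 * transition.T @ scores

    # YAKE part: per-word scores (lower YAKE score = more important word).
    extractor = yake.KeywordExtractor(lan="id", n=1, top=20)
    keywords = {k.lower(): v for k, v in extractor.extract_keywords(" ".join(sentences))}
    def kw_bonus(s):
        words = s.lower().split()
        return sum(1 - keywords[w] for w in words if w in keywords) / max(len(words), 1)

    final = scores + alpha * np.array([kw_bonus(s) for s in sentences])
    k = max(1, int(ratio * len(sentences)))         # keep ~30% of the article
    keep = sorted(np.argsort(final)[-k:])           # preserve original order
    return [sentences[i] for i in keep]
```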
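For entry 4, a sketch of a two-group binning plus logistic-regression scorecard on synthetic data. The median split below is only a stand-in: the paper's "smallest distribution difference" binning criterion is its novel contribution and is not reproduced here.

```python
# Two-group binning + logistic regression on synthetic data; the split
# criterion is a placeholder, not the paper's modified binning method.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))                     # synthetic borrower features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1000) > 0).astype(int)

# Bin every variable into two groups (0/1) around a cut point fitted on
# the "early months", mirroring the paper's train-early / test-late split.
cuts = np.median(X[:600], axis=0)
X_binned = (X > cuts).astype(int)

model = LogisticRegression().fit(X_binned[:600], y[:600])
proba = model.predict_proba(X_binned[600:])[:, 1]
print("holdout AUC:", round(roc_auc_score(y[600:], proba), 3))
```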
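For entry 5, a sketch of DistilBERT transfer learning for Indonesian tweet classification. The multilingual checkpoint, the three-way label set, and the example tweets are assumptions.

```python
# One illustrative fine-tuning step for DistilBERT; labels are placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "distilbert-base-multilingual-cased"        # multilingual covers Indonesian
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=3)

tweets = ["limbah pabrik dibuang ke sungai", "sampah plastik menumpuk di pantai"]
labels = torch.tensor([0, 1])                      # e.g. 0=liquid, 1=solid, 2=other (assumed)

batch = tokenizer(tweets, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
optimizer.zero_grad()
loss = model(**batch, labels=labels).loss          # cross-entropy over 3 classes
loss.backward()
optimizer.step()
print("loss after one step:", loss.item())
```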
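For entry 6, a sketch of the SVM stage that assigns tweets to the eight disaster types. The NER and geocoding stage (StanfordNER + ArcGIS) is external tooling and is not reproduced; the TF-IDF features and example tweets are assumptions.

```python
# Eight-class disaster classification with a linear SVM; toy training data.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

CLASSES = ["flood", "earthquake", "landslide", "tsunami",
           "hurricane", "forest fire", "drought", "volcanic eruption"]

tweets = ["banjir merendam jalan di Jakarta", "gempa terasa di Yogyakarta"]
labels = ["flood", "earthquake"]                   # placeholder annotations

clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(tweets, labels)
print(clf.predict(["gunung meletus mengeluarkan abu"]))
```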
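For entry 7, a sketch of the best-performing architecture: MentalBERT token representations feeding an LSTM classifier. The checkpoint id `mental/mental-bert-base-uncased` and the head dimensions are assumptions, and training is omitted.

```python
# Transformer word representations -> LSTM -> linear head (forward pass only).
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

name = "mental/mental-bert-base-uncased"           # assumed MentalBERT checkpoint id
tokenizer = AutoTokenizer.from_pretrained(name)
encoder = AutoModel.from_pretrained(name)

lstm = nn.LSTM(input_size=768, hidden_size=128, batch_first=True)
head = nn.Linear(128, 2)                           # depressive vs. non-depressive

batch = tokenizer(["i feel empty every day"], return_tensors="pt")
with torch.no_grad():
    tokens = encoder(**batch).last_hidden_state    # contextual word representations
    _, (h_n, _) = lstm(tokens)                     # LSTM summarizes the sequence
    logits = head(h_n[-1])
print(logits.softmax(dim=-1))
```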
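For entry 8, a sketch of the hybrid NIDS: a Conv1D layer extracts spatial patterns from flow features, a BiLSTM extracts temporal ones, and a sigmoid head makes the binary attack/benign decision. The ten input features echo the abstract's PCA reduction; filter sizes and units are assumptions.

```python
# CNN + BiLSTM intrusion-detection architecture (binary variant).
import tensorflow as tf
from tensorflow.keras import layers

n_features = 10                                    # e.g. after PCA to ten dimensions
model = tf.keras.Sequential([
    layers.Input(shape=(n_features, 1)),           # each flow as a 1-D "sequence"
    layers.Conv1D(64, kernel_size=3, activation="relu"),   # spatial features
    layers.MaxPooling1D(pool_size=2),
    layers.Bidirectional(layers.LSTM(64)),         # temporal features
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),         # attack vs. benign
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```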
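For entry 9, a sketch of the spam pipeline's two visible stages: language detection (here via the `langdetect` package, an assumed stand-in) followed by an embedding + LSTM classifier. Vocabulary size and sequence length are assumptions; the data-augmentation stage is omitted.

```python
# Language detection + embedding/LSTM spam classifier (architecture only).
import tensorflow as tf
from tensorflow.keras import layers
from langdetect import detect                      # routes EN vs. ID preprocessing

print(detect("gratis hadiah klik tautan ini"))     # likely 'id' (Indonesian)

vocab_size, max_len = 20000, 100
model = tf.keras.Sequential([
    layers.Input(shape=(max_len,)),
    layers.Embedding(vocab_size, 128),             # learned word embeddings
    layers.LSTM(64),
    layers.Dense(1, activation="sigmoid"),         # spam vs. non-spam
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```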
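For entry 10, a sketch of the bytecode-feature idea: train word2vec on opcode sequences, then classify the resulting vector sequences with a small CNN. The toy opcode sequences stand in for a real PHP bytecode dump; all layer sizes are assumptions.

```python
# word2vec over opcodes + 1-D CNN classifier; opcode data is a toy placeholder.
import numpy as np
from gensim.models import Word2Vec
import tensorflow as tf
from tensorflow.keras import layers

opcode_seqs = [["FETCH_R", "CONCAT", "ECHO"], ["INCLUDE_OR_EVAL", "EXIT"]]
w2v = Word2Vec(sentences=opcode_seqs, vector_size=32, window=5, min_count=1)

max_len = 50
def vectorize(seq):                                # pad/truncate to fixed length
    mat = np.zeros((max_len, 32), dtype="float32")
    for i, op in enumerate(seq[:max_len]):
        mat[i] = w2v.wv[op]                        # opcode -> learned vector
    return mat

X = np.stack([vectorize(s) for s in opcode_seqs])

model = tf.keras.Sequential([
    layers.Input(shape=(max_len, 32)),
    layers.Conv1D(64, kernel_size=3, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(1, activation="sigmoid"),         # web shell vs. benign
])
model.compile(optimizer="adam", loss="binary_crossentropy")
print(model.predict(X).shape)
```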