Please use this identifier to cite or link to this item:
http://dx.doi.org/10.25673/118988
Title: | PaSSw0rdVib3s! : AI-assisted password recognition for digital forensic investigations |
Author(s): | Dijk, Romke; Wetering, Judith; Argentini, Ranieri; Gorka, Leonie; Luenen, Anne Fleur; Minnema, Sieds; Rijgersberg, Edwin; Ugen, Mattijs; Mann, Zoltán Ádám; Geradts, Zeno |
Issue Date: | 2025 |
Type: | Article |
Language: | English |
Abstract: | In digital forensic investigations, the ability to identify passwords in cleartext within digital evidence is often essential for the acquisition of data from encrypted devices. Passwords may be stored in cleartext, knowingly or accidentally, in various locations within a device, e.g., in text messages, notes, or system log files. Finding those passwords is a challenging task, as devices typically contain a substantial amount and a wide variety of textual data. This paper explores the performance of several different types of machine learning models trained to distinguish passwords from non-passwords and to rank them according to their likelihood of being a human-generated password. Three deep learning models (PassGPT, CodeBERT and DistilBERT) were fine-tuned, and two traditional machine learning models (a feature-based XGBoost and a TF/IDF-based XGBoost) were trained. These were compared to the existing state-of-the-art technology, a password recognition model based on probabilistic context-free grammars. Our research shows that the fine-tuned PassGPT model outperforms the other models. We show that the combination of multiple different types of training datasets, carefully chosen based on the context, is needed to achieve good results. In particular, it is important to train not only on dictionary words and leaked credentials, but also on data scraped from chats and websites. Our approach was evaluated with realistic hardware that could fit inside an investigator's workstation. The evaluation was conducted on the publicly available RockYou and MyHeritage leaks, but also on a dataset derived from real casework, showing that these innovations can indeed be used in a real forensic context. |
URI: | https://opendata.uni-halle.de//handle/1981185920/120944 http://dx.doi.org/10.25673/118988 |
Journal Title: | Forensic Science International. Digital investigation |
Publisher: | Elsevier ScienceDirect |
Publisher Place: | [Amsterdam] |
Volume: | 52 |
Original Publication: | 10.1016/j.fsidi.2025.301870 |
Page Start: | 1 |
Page End: | 8 |
Appears in Collections: | Open Access Publikationen der MLU |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
1-s2.0-S2666281725000095-main.pdf | | 1.64 MB | Adobe PDF |