Enhancing Personally Identifiable Information Recognition Using Compact Transformer with Hybrid Sequential Architecture

Author : Pattarapoom Nokkaew

Abstract : The rapid expansion of unstructured data across organizational repositories has significantly increased the risk of sensitive information leakage and regulatory non-compliance. Named Entity Recognition for Personally Identifiable Information plays a crucial role in automated data protection systems by identifying and classifying sensitive entities such as personal names, identification numbers, addresses, and financial information. While traditional rule-based approaches offer interpretability, they often suffer from limited generalization and low recall. In contrast, deep learning-based approaches have demonstrated superior performance but are frequently constrained by high computational complexity and memory requirements. This research investigates the effectiveness of compact Transformer architectures for PII-NER, focusing on the DeBERTa v3 xsmall model as the primary backbone. With approximately 22 million backbone parameters, DeBERTa v3 xsmall provides a favorable trade-off between model size and representational power. To further enhance sequence modeling capabilities, this study integrates Bidirectional Long Short-Term Memory, Bidirectional Gated Recurrent Unit, and Conditional Random Fields layers on top of the Transformer encoder. Experiments were conducted on the English portion of the AI4Privacy PII dataset containing approximately 43,000 annotated samples. The experimental results demonstrate that the hybrid architecture combining DeBERTa v3 xsmall + BiGRU + CRF achieves the best performance, with an F1-score of 0.9139 after 15 training epochs. The findings indicate that integrating compact Transformer backbones with sequential modeling and structured prediction layers significantly improves label consistency and entity boundary detection while maintaining computational efficiency. 
This study highlights the practical feasibility of deploying lightweight yet high-performance PII-NER systems in real-world environments with limited hardware resources.
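
To illustrate why a structured prediction layer such as a CRF can improve label consistency and entity boundary detection, the following minimal Python sketch compares per-token greedy labeling with Viterbi decoding under BIO transition constraints. The label set, the emission scores, and the hand-written transition rule are illustrative assumptions, not the model, data, or learned parameters of the study; a trained CRF would learn its transition scores from data rather than hard-code them.

```python
# Illustrative sketch only: labels, scores, and the transition rule are
# assumptions for demonstration, not values from the study.
LABELS = ["O", "B-NAME", "I-NAME"]
FORBIDDEN = float("-inf")


def transition_score(prev, curr):
    """Hand-written BIO constraint: I-X may only follow B-X or I-X."""
    if curr.startswith("I-"):
        entity = curr[2:]
        if prev not in ("B-" + entity, "I-" + entity):
            return FORBIDDEN
    return 0.0


def start_score(label):
    """A sequence may not begin inside an entity."""
    return FORBIDDEN if label.startswith("I-") else 0.0


def greedy_decode(emissions):
    """Pick the best label for each token independently."""
    return [max(LABELS, key=lambda lab: scores[lab]) for scores in emissions]


def viterbi_decode(emissions):
    """Return the highest-scoring label sequence that respects the constraints.

    emissions: one dict per token, mapping label -> score.
    """
    if not emissions:
        return []
    # best[label] = (score of best path ending in label, that path)
    best = {lab: (start_score(lab) + emissions[0][lab], [lab]) for lab in LABELS}
    for scores in emissions[1:]:
        new_best = {}
        for curr in LABELS:
            score, prev = max(
                ((best[p][0] + transition_score(p, curr), p) for p in LABELS),
                key=lambda cand: cand[0],
            )
            new_best[curr] = (score + scores[curr], best[prev][1] + [curr])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


# Hypothetical emission scores for the tokens: "call John Smith today"
EMISSIONS = [
    {"O": 2.0, "B-NAME": 0.1, "I-NAME": 0.0},
    {"O": 1.0, "B-NAME": 0.9, "I-NAME": 0.2},
    {"O": 0.5, "B-NAME": 0.3, "I-NAME": 2.0},
    {"O": 2.0, "B-NAME": 0.0, "I-NAME": 0.1},
]

print(greedy_decode(EMISSIONS))   # ['O', 'O', 'I-NAME', 'O']
print(viterbi_decode(EMISSIONS))  # ['O', 'B-NAME', 'I-NAME', 'O']
```

In this toy case the greedy labeling produces an `I-NAME` tag with no preceding `B-NAME`, while sequence-level decoding recovers a well-formed two-token entity. This is the kind of boundary repair that a CRF layer is generally expected to provide on top of an encoder.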

Keywords : Compact Transformer, Named Entity Recognition, Personally Identifiable Information.

Conference Name : International Conference on Natural Language Processing and Computational Linguistics (ICNLPCL-26)

Conference Place : Phuket, Thailand

Conference Date : 19th Mar 2026
