Browser Fingerprint Coding Methods Increasing the Effectiveness of User Identification in the Web Traffic

Marcin Gabryel 1, Konrad Grzanek 2, and Yoichi Hayashi 3
  • 1 Department of Computer Engineering, Czestochowa University of Technology, 42-200 Częstochowa, Poland
  • 2 Information Technology Institute, University of Social Sciences, Clark University, 90-113 Lodz
  • 3 Department of Computer Science, Meiji University, Japan


A web-based browser fingerprint (or device fingerprint) is a tool used to identify and track user activity in web traffic. It is also used to identify computers that abuse online advertising and to prevent credit card fraud. A device fingerprint is created by extracting multiple parameter values from a browser API (e.g. operating system type or browser version). The acquired parameter values are then passed through a hash function to create a hash. The disadvantage of this method is its high susceptibility to small, normally occurring changes (e.g. a change in the browser version number or screen resolution). Minor changes in the input values generate a completely different fingerprint hash, making it impossible to find similar fingerprints in the database. On the other hand, omitting these unstable values when creating a hash significantly limits the ability of the fingerprint to distinguish between devices. This weak point is commonly exploited by fraudsters, who knowingly evade this form of protection by deliberately changing the values of device parameters. The paper presents methods that significantly limit this type of activity. New algorithms for coding and comparing fingerprints are presented, in which the values of parameters with low stability and low entropy are given particular attention. The fingerprint generation methods are based on the popular MinHash, LSH, and autoencoder methods. The effectiveness of coding and comparison for each of the presented methods was also examined against the currently used hash generation method. Authentic data on the devices and browsers of users visiting 186 different websites were collected for the research.
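
The contrast between an exact hash and a similarity-preserving code can be illustrated with a minimal Python sketch. It is not the authors' implementation: the attribute names and values, the token representation, the use of SHA-256 as the underlying hash, and the parameters (64 hash functions, 16 bands) are all assumptions chosen for illustration. The sketch shows a textbook MinHash signature with simple LSH banding; the autoencoder-based method is not covered.

```python
import hashlib

# Hypothetical attribute values for one device, before and after a browser update.
before = {
    "os": "Windows 10",
    "browser": "Firefox 74.0",
    "screen": "1920x1080",
    "timezone": "UTC+1",
    "language": "pl-PL",
    "platform": "Win32",
    "cores": "8",
    "touch": "no",
}
after = dict(before, browser="Firefox 75.0")


def exact_hash(attrs):
    """Classic fingerprint: a single hash over all attribute values."""
    joined = "|".join(f"{k}={attrs[k]}" for k in sorted(attrs))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def tokens(attrs):
    """Represent a fingerprint as a set of 'name=value' tokens (an assumption)."""
    return {f"{k}={v}" for k, v in attrs.items()}


def jaccard(a, b):
    """Exact Jaccard similarity of two token sets."""
    return len(a & b) / len(a | b)


def _seeded_hash(seed, token):
    """One member of a family of hash functions, selected by the seed."""
    digest = hashlib.sha256(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(token_set, num_hashes=64):
    """MinHash: for each hash function keep the minimum hash over all tokens."""
    return [min(_seeded_hash(seed, t) for t in token_set)
            for seed in range(num_hashes)]


def estimated_similarity(sig_a, sig_b):
    """The fraction of equal signature positions estimates the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def lsh_buckets(signature, bands=16):
    """LSH banding: two signatures are candidates if any band is identical."""
    rows = len(signature) // bands
    return {(i, tuple(signature[i * rows:(i + 1) * rows])) for i in range(bands)}


if __name__ == "__main__":
    sig_before = minhash_signature(tokens(before))
    sig_after = minhash_signature(tokens(after))
    print("exact hashes equal:", exact_hash(before) == exact_hash(after))
    print("true Jaccard:", jaccard(tokens(before), tokens(after)))
    print("MinHash estimate:", estimated_similarity(sig_before, sig_after))
    print("shared LSH bands:", len(lsh_buckets(sig_before) & lsh_buckets(sig_after)))
```

In this toy example the two fingerprints share seven of nine distinct tokens, so the exact hashes differ while the MinHash signatures are expected to agree in most positions. Lookup by shared LSH bands can then retrieve the earlier fingerprint as a candidate match instead of requiring an identical hash.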



