Update dependency chardet to v7.1.0 #76

Merged
timatlee merged 1 commit from renovate/chardet-7.x into main 2026-03-11 17:02:00 -06:00

This PR contains the following updates:

Package Update Change
chardet (changelog) minor ==7.0.1 → ==7.1.0

Release Notes

chardet/chardet (chardet)

v7.1.0: chardet 7.1.0

Compare Source

Features

  • Added PEP 263 encoding declaration detection: # -*- coding: ... -*- and # coding=... declarations on lines 1–2 of Python source files are now recognized with confidence 0.95 (#249)
  • Added chardet.universaldetector backward-compatibility stub so that from chardet.universaldetector import UniversalDetector works with a deprecation warning (#341)
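
For readers unfamiliar with PEP 263, the declaration check in the first feature can be sketched roughly as follows. This is not chardet's implementation: the regular expression is the one published in PEP 263, and the helper name `find_pep263_encoding` is hypothetical. The sketch also simplifies the PEP slightly by scanning both of the first two lines unconditionally.

```python
import re
from typing import Optional

# Pattern given in PEP 263 for an encoding declaration inside a comment line.
_CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def find_pep263_encoding(data: bytes) -> Optional[str]:
    """Return the encoding declared on line 1 or 2 of the data, or None."""
    for line in data.splitlines()[:2]:
        match = _CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return None


source = b"#!/usr/bin/env python\n# -*- coding: latin-1 -*-\nx = 1\n"
declared = find_pep263_encoding(source)
```

A declaration on line 3 or later is ignored, which matches the "lines 1–2" wording in the release note.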

Fixes

  • Fixed false UTF-7 detection of ASCII text containing ++ or +word patterns (#332)
  • Fixed 0.5s startup cost on first detect() call: model norms are now computed during loading instead of lazily iterating 21M entries (#333)
  • Fixed undocumented encoding name changes between chardet 5.x and 7.0: detect() now returns chardet 5.x-compatible names by default (#338)
  • Improved ISO-2022-JP family detection — recognizes ESC sequences for ISO-2022-JP-2004 (JIS X 0213) and ISO-2022-JP-EXT (JIS X 0201 Kana)
  • Fixed silent truncation of corrupt model data (iter_unpack yielded fewer tuples instead of raising)
  • Fixed incorrect date in LICENSE
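
As background for the UTF-7 fix, the short sketch below uses only Python's built-in `utf-7` codec to show why a plus sign is ambiguous: in UTF-7, `+` opens a base64-encoded run, and `+-` is the escape for a literal plus. It illustrates the encoding itself and says nothing about how chardet's detector decides; the helper name `utf7_roundtrip` is hypothetical.

```python
def utf7_roundtrip(text: str) -> bytes:
    """Encode text as UTF-7 and confirm it decodes back unchanged."""
    encoded = text.encode("utf-7")
    if encoded.decode("utf-7") != text:
        raise ValueError("UTF-7 round trip failed")
    return encoded


# "+-" is the UTF-7 escape sequence for a literal plus sign.
literal_plus = b"a+-b".decode("utf-7")

# Non-ASCII text is emitted as an ASCII-only run that starts with "+".
shifted = utf7_roundtrip("é")
```

Because every UTF-7 byte is plain ASCII, ordinary ASCII text that happens to contain `+` can resemble a shifted run, which is presumably the class of false positive the fix addresses.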

Performance

  • 5.5x faster first-detect time (~0.42s → ~0.075s) by computing model norms as a side-product of load_models()
  • ~40% faster model parsing via struct.iter_unpack for bulk entry extraction (eliminates ~305K individual unpack calls)
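
The bulk-parsing idea, together with the truncation fix listed under Fixes, can be sketched with the standard library alone. The record layout below (a 4-byte count followed by 2-byte key / 4-byte value records) is a made-up assumption for illustration and is not chardet's model format; `parse_records` is likewise a hypothetical name.

```python
import struct
from typing import List, Tuple

# Hypothetical record layout: little-endian 2-byte key, 4-byte value.
RECORD = struct.Struct("<HI")


def parse_records(blob: bytes) -> List[Tuple[int, int]]:
    """Parse a count-prefixed table of records in one bulk pass."""
    (count,) = struct.unpack_from("<I", blob, 0)
    body = blob[4 : 4 + count * RECORD.size]
    # iter_unpack walks the whole buffer in C instead of one unpack call
    # per record; it raises struct.error if a record is cut mid-way.
    records = list(RECORD.iter_unpack(body))
    # A buffer truncated exactly on a record boundary would otherwise
    # yield fewer tuples without any error, so check the count explicitly.
    if len(records) != count:
        raise ValueError(f"expected {count} records, found {len(records)}")
    return records


blob = struct.pack("<I", 2) + RECORD.pack(1, 10) + RECORD.pack(2, 20)
parsed = parse_records(blob)
```

The explicit count check is the design point: slicing a short buffer never fails on its own, so the parser has to compare what it got with what the header promised.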

New API parameters

  • Added compat_names parameter (default True) to detect(), detect_all(), and UniversalDetector — set to False to get raw Python codec names instead of chardet 5.x/6.x compatible display names
  • Added prefer_superset parameter (default False) — remaps legacy ISO/subset encodings to their modern Windows/CP superset equivalents (e.g., ASCII → Windows-1252, ISO-8859-1 → Windows-1252). This will default to True in the next major version (8.0).
  • Deprecated should_rename_legacy in favor of prefer_superset — a deprecation warning is emitted when used
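
A minimal usage sketch of the new parameters, assuming chardet 7.1.0 or later is installed and that `compat_names` and `prefer_superset` are accepted as keyword arguments to `detect()` as the notes above describe. The wrapper name `detect_raw_codec` is hypothetical; it returns `None` when chardet is missing and falls back to a plain call on older versions that reject the keywords.

```python
from typing import Any, Optional


def detect_raw_codec(data: bytes) -> Optional[Any]:
    """Call chardet.detect() asking for raw Python codec names."""
    try:
        import chardet
    except ImportError:
        return None  # chardet is not installed in this environment
    try:
        # Keyword names are taken from the 7.1.0 release notes.
        return chardet.detect(data, compat_names=False, prefer_superset=False)
    except TypeError:
        # Older chardet releases do not accept these keywords.
        return chardet.detect(data)


result = detect_raw_codec("héllo wörld, ça va?".encode("utf-8"))
```

Passing `prefer_superset` explicitly may be worth considering for callers who want stable output, since the notes say its default changes to `True` in 8.0.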

Improvements

  • Switched internal canonical encoding names to Python codec names (e.g., "utf-8" instead of "UTF-8"), with compat_names controlling the public output format
  • Added lookup_encoding() to registry for case-insensitive resolution of arbitrary encoding name input to canonical names
  • Achieved 100% line coverage across all source modules (+31 tests)
  • Updated benchmark numbers: 98.2% encoding accuracy, 95.2% language accuracy on 2,510 test files
  • Pinned test-data cloning to chardet release version tags for reproducible builds
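
The notion of a canonical Python codec name can be illustrated with the standard library's `codecs.lookup()`, which already resolves names case-insensitively and through aliases. This is an analogy only, not the `lookup_encoding()` function added to chardet's registry, whose exact signature is not shown in these notes; `canonical_codec_name` is a hypothetical helper.

```python
import codecs
from typing import Optional


def canonical_codec_name(name: str) -> Optional[str]:
    """Resolve an arbitrary encoding label to Python's canonical codec name."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None  # unknown encoding label


names = [canonical_codec_name(n) for n in ("UTF-8", "utf8", "Latin-1")]
```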

Full changelog: https://chardet.readthedocs.io/en/latest/changelog.html


Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR has been generated by Renovate Bot.

renovate-bot added 1 commit 2026-03-11 16:00:20 -06:00
Update dependency chardet to v7.1.0
All checks were successful
Build Docker Image / build (pull_request) Successful in 1m36s
63cef090de
timatlee merged commit c3b7969f99 into main 2026-03-11 17:02:00 -06:00
Reference: timatlee/cloudflare-ddns-docker-updated#76