mkdocs-material: Blog category accent problem

Context

Something has changed: this was still working 1-2 months ago. I have installed the latest version.

Bug description

If a category name contains an accented character, the build fails with:

ERROR   -  Encoding error reading file: blog\category\általános.md
ERROR   -  Error reading page 'blog/category/általános.md': 'utf-8' codec can't decode byte 0xc1 in position 2: invalid start byte
Traceback (most recent call last):
...

Related links

categories

Reproduction

9.4.1+insiders.4.42.0-accent.zip

Steps to reproduce

Uncomment #- Általános in the sample-post-1.md file:

---
date: 2022-01-01
categories:
  - Category 1
  #- Általános
...

Then run mkdocs serve.

Browser

Chrome

About this issue

  • State: closed
  • Created 9 months ago
  • Comments: 24 (10 by maintainers)

Most upvoted comments

Thanks guys! 😃

@squidfunk will do a pull request.

Yes, that does fix the problem for me. Well spotted!

Sorry, I used the second line and also added a comma before encoding. I will check the first one once I am back at the Windows PC.

Thanks, this is very useful. I am able to reproduce the problem and am setting up a development environment on a Windows box to try and debug. Might be tomorrow before I can really have a look at this, though.

Without blog, theme Material --> ok

Without blog, theme MkDocs --> ok

általános.md

With blog again, and uncommented Általános category:

Error reading page 'blog/category/altalanos.md': 'utf-8' codec can't decode byte 0xc1 in position 2: invalid start byte
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\yyyy\AppData\Local\Programs\Python\Python311\Scripts\mkdocs.exe\__main__.py", line 7, in <module>
  File "C:\Users\yyyy\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\yyyy\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Users\yyyy\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\yyyy\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\yyyy\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\yyyy\AppData\Local\Programs\Python\Python311\Lib\site-packages\mkdocs\__main__.py", line 270, in serve_command
    serve.serve(**kwargs)
  File "C:\Users\yyyy\AppData\Local\Programs\Python\Python311\Lib\site-packages\mkdocs\commands\serve.py", line 86, in serve
    builder(config)
  File "C:\Users\yyyy\AppData\Local\Programs\Python\Python311\Lib\site-packages\mkdocs\commands\serve.py", line 67, in builder
    build(config, live_server=None if is_clean else server, dirty=is_dirty)
  File "C:\Users\yyyy\AppData\Local\Programs\Python\Python311\Lib\site-packages\mkdocs\commands\build.py", line 322, in build
    _populate_page(file.page, config, files, dirty)
  File "C:\Users\yyyy\AppData\Local\Programs\Python\Python311\Lib\site-packages\mkdocs\commands\build.py", line 167, in _populate_page
    page.read_source(config)
  File "C:\Users\yyyy\AppData\Local\Programs\Python\Python311\Lib\site-packages\material\plugins\blog\structure\__init__.py", line 229, in read_source
    super().read_source(config)
  File "C:\Users\yyyy\AppData\Local\Programs\Python\Python311\Lib\site-packages\mkdocs\structure\pages.py", line 203, in read_source
    source = f.read()
             ^^^^^^^^
  File "<frozen codecs>", line 322, in decode
  File "C:\Users\yyyy\AppData\Local\Programs\Python\Python311\Lib\encodings\utf_8_sig.py", line 69, in _buffer_decode
    return codecs.utf_8_decode(input, errors, final)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc1 in position 2: invalid start byte
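
The byte value and position in this error are consistent with the generated category page having been written in a Windows code page rather than UTF-8. A minimal sketch, assuming the page content is "# Általános" and that the writer used cp1252 (both are assumptions, the traceback does not confirm either):

```python
# Assumption: the generated page holds a heading built from the category name.
content = "# Általános"

# Encode as cp1252, a common default for open() on Western-locale Windows.
raw = content.encode("cp1252")

# The capital A with acute becomes the single byte 0xC1 at index 2,
# right after '#' and ' '.
first_bad_byte = raw[2]

# Decoding those bytes as UTF-8 fails: 0xC1 is never a valid start byte.
try:
    raw.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

print(hex(first_bad_byte), decode_error)
```

If this is what happens, the reader is behaving correctly and the problem lies on the writing side.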

Do you need anything else?

The original file was UTF-8 encoded (I had a look with a hex editor), and I assume vim did not change that when I edited the file to put the umlauts in.
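
For anyone without a hex editor at hand, the same check can be approximated in a few lines of Python. This is only a sketch: is_valid_utf8 is a hypothetical helper and the two temporary files are illustrative stand-ins for a real post.

```python
import os
import tempfile


def is_valid_utf8(path: str) -> bool:
    """Return True if the file's raw bytes decode cleanly as UTF-8."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Demonstration with two temporary files (contents are illustrative).
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.md")
    bad = os.path.join(tmp, "bad.md")
    with open(good, "wb") as f:
        f.write("ÄÖÜ".encode("utf-8"))
    with open(bad, "wb") as f:
        f.write("ÄÖÜ".encode("cp1252"))
    results = (is_valid_utf8(good), is_valid_utf8(bad))

print(results)
```

A file that passes this check rules out the source post as the culprit, though it says nothing about files the plugin generates during the build.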

Right, made the offending blog post a normal page and the site builds fine. Also changed the text to something with German ÄÖÜ and this causes a different error:

Traceback (most recent call last):
  File "C:\Users\Alex Voss\src\mkdocs-material\reproduce\accent\venv\Lib\site-packages\mkdocs\livereload\__init__.py", line 193, in _build_loop
    func()
  File "C:\Users\Alex Voss\src\mkdocs-material\reproduce\accent\venv\Lib\site-packages\mkdocs\commands\serve.py", line 67, in builder
    build(config, live_server=None if is_clean else server, dirty=is_dirty)
  File "C:\Users\Alex Voss\src\mkdocs-material\reproduce\accent\venv\Lib\site-packages\mkdocs\commands\build.py", line 304, in build
    files = config.plugins.on_files(files, config=config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Alex Voss\src\mkdocs-material\reproduce\accent\venv\Lib\site-packages\mkdocs\plugins.py", line 533, in on_files
    return self.run_event('files', files, config=config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Alex Voss\src\mkdocs-material\reproduce\accent\venv\Lib\site-packages\mkdocs\plugins.py", line 507, in run_event
    result = method(item, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Alex Voss\src\mkdocs-material\reproduce\accent\venv\Lib\site-packages\material\plugins\blog\plugin.py", line 145, in on_files
    self.blog.views.extend(views)
  File "C:\Users\Alex Voss\src\mkdocs-material\reproduce\accent\venv\Lib\site-packages\material\plugins\blog\plugin.py", line 603, in _generate_categories
    self._save_to_file(file.abs_src_path, f"# {name}")
  File "C:\Users\Alex Voss\src\mkdocs-material\reproduce\accent\venv\Lib\site-packages\material\plugins\blog\plugin.py", line 876, in _save_to_file
    f.write(content)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.1520.0_x64__qbz5n2kfra8p0\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode characters in position 3-4: character maps to <undefined>
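
The write in this traceback goes through cp1252, which suggests the file was opened without an explicit encoding and so picked up the Windows locale default. A minimal sketch of writing with an explicit encoding instead; save_to_file and read_source here are hypothetical stand-ins, not the plugin's actual code:

```python
import os
import tempfile


def save_to_file(path: str, content: str) -> None:
    """Hypothetical helper: write text as UTF-8 regardless of the OS locale."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)


def read_source(path: str) -> str:
    """Read the file back with utf-8-sig, the codec used in the earlier traceback."""
    with open(path, encoding="utf-8-sig") as f:
        return f.read()


# Round trip through a temporary directory (paths are illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    page = os.path.join(tmp, "blog", "category", "altalanos.md")
    save_to_file(page, "# Általános")
    round_trip = read_source(page)

print(round_trip == "# Általános")
```

Passing encoding="utf-8" on the write makes the result independent of the platform default, so the later UTF-8 read sees the bytes it expects.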

I will set up a development environment on Windows so I can have a look.