filebrowser: Some txt files can be previewed and edited, but some cannot.

Description Some txt files can be previewed and edited, but some cannot.

Expected behaviour All txt files can be previewed. Otherwise, what are the preview and edit functions for?

What is happening instead? Some txt files can be previewed and edited, but some cannot. For example, I have a file.txt in ./files that can only be downloaded; it cannot be previewed or edited. But test.txt in ./files/test can be previewed and edited, and of course downloaded too.

Additional context I am using filebrowser on Ubuntu 18.04 LTS x86_64, launched by a custom systemd service:

#/etc/systemd/system/filebrowser.service

[Unit]
Description=File Browser v2 Service
After=network.target
Wants=network.target

[Service]
Type=simple
PIDFile=/var/run/filebrowser.pid
ExecStart=/usr/local/bin/filebrowser -c=/home/filebrowser/config.json -d=/home/filebrowser/filebrowser.db -r=/home/filebrowser
RemainAfterExit=no
Restart=on-failure
RestartPreventExitStatus=23

[Install]
WantedBy=multi-user.target

The file /home/filebrowser/config.json (which does not seem to have any effect at all):

{
  "settings": {
    "signup": false,
    "defaults": {
      "scope": "./files",
      "locale": "zh-tw",
      "viewMode": "list",
      "sorting": {
        "by": "name",
        "asc": false
      },
      "perm": {
        "admin": false,
        "execute": false,
        "create": false,
        "rename": false,
        "modify": true,
        "delete": false,
        "share": false,
        "download": true
      },
      "commands": []
    },
    "authMethod": "json",
    "branding": {
      "name": "Magical Space J1",
      "disableExternal": false,
      "files": "/home/filebrowser"
    },
    "commands": {
      "after_copy": [],
      "after_delete": [],
      "after_rename": [],
      "after_save": [],
      "after_upload": [],
      "before_copy": [],
      "before_delete": [],
      "before_rename": [],
      "before_save": [],
      "before_upload": []
    },
    "shell": [],
    "rules": []
  },
  "server": {
    "root": ".",
    "baseURL": "",
    "tlsKey": "",
    "tlsCert": "",
    "port": "8080",
    "address": "127.0.0.1",
    "log": "stdout"
  },
  "auther": {
    "recaptcha": null
  }
}

How to reproduce? I don’t know yet. There’s no specific pattern.

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 26 (18 by maintainers)

Most upvoted comments

@chinanoahli thanks for the sample txt file. that’s very helpful.

 ll USA_Oregon_1.txt
-rw-r--r-- 1 hacklog hacklog 1052 Mar 23 13:46 USA_Oregon_1.txt

This file is valid UTF-8 without a BOM. Nothing is wrong with it.

Finally, I found the buggy code on line 167 of files/file.go: https://github.com/filebrowser/filebrowser/blob/master/files/file.go#L167

func isBinary() is OK, but the call to it is wrong.

	buffer := make([]byte, 512)
	n, err := reader.Read(buffer)
	// Our file is 1052 bytes, so in this case n = 512.

	// Here is the problem: string(buffer[:n])
	case isBinary(string(buffer[:n])) || i.Size > 10*1024*1024: // 10 MB
		i.Type = "blob"
		return nil

For this file, the expression is effectively `string(buffer[:512])`.

I wrote some test code to pin down the problem:

package main

import (
	"fmt"
	"io/ioutil"
	"log"
)

func isBinary(content string) bool {
	for index, b := range content {
		log.Printf("start check index:%x, b: %x , %#U", index, b, b)
		// 65533 is the unknown char
		// 8 and below are control chars (e.g. backspace, null, eof, etc)
		if b <= 8 {
			log.Printf("b <= 8, index:%x, b: %x , %#U", index, b, b)
			return true
		}

		if b == 65533 {
			log.Printf("b == 65533, index:%x, %x , %#U", index, b, b)
			return true
		}

	}
	return false
}

func main() {
	filename := "./USA_Oregon_1.txt"
	c, err := ioutil.ReadFile(filename)
	if err != nil {
		log.Fatal(err)
	}

	// result: false
	fmt.Printf("isBinary(string(c)): %#v \n", isBinary(string(c)))

	// result: true
	fmt.Printf("isBinary(string(c[:512])): %#v \n", isBinary(string(c[:512])))
}


The check fails at index 1fe (decimal 510, and 510%3 == 0), so the last Chinese character should be U+95F4 '间'. 95F4 is the hex code point; the correct UTF-8 bytes in hex are E9 97 B4 (you can use this tool to convert: http://www.ltg.ed.ac.uk/~richard/utf-8.cgi?input=95F4&mode=hex ).

Since the code only takes the first 512 bytes, it gets just E9 97. Go treats this truncated sequence as invalid UTF-8, so it converts it to U+FFFD (see the Go doc: https://godoc.org/golang.org/x/text/encoding#Encoding). FFFD is the 65533 in func isBinary (I suggest writing FFFD in func isBinary instead of 65533, because 65533 is hard to understand).

According to the Go doc:

Replacement is the replacement encoding. Decoding from the replacement encoding yields a single ‘\uFFFD’ replacement rune. Encoding from UTF-8 to the replacement encoding yields the same as the source bytes except that invalid UTF-8 is converted to ‘\uFFFD’.

It is defined at http://encoding.spec.whatwg.org/#replacement

@1138-4EB this fix (branch fix-isbinary) will not work as expected, because text files that are not UTF-8 encoded will also be detected as "text", which makes filebrowser think the file is editable.

For example, take a GBK-encoded srt file.

An iconv command shows that it is a GBK file.

http.DetectContentType() also returns the wrong MIME type:

"text/plain; charset=utf-8"

The Linux file command also returned the wrong type: Non-ISO extended-ASCII text

The.Secret.in.Their.Eyes.2009.BluRay.1080p.x265.10bit.MNHD-FRDS.chs.zip

@chinanoahli you’d better attach your sample txt file to this issue, so that we can test it.