soupsieve: the :not selector doesn't work as expected.

The minimal code that reproduces the bug is listed below:

import bs4
b = bs4.BeautifulSoup("<a href=\"http://www.example.com\"></a>")
b.body.a['foo'] = None  # str(b) -> <html><body><a foo href="http://www.example.com"></a></body></html>
b.select("a:not([foo])")  # actual result: [<a foo href="http://www.example.com"></a>]

In this case, the a tag shouldn’t be selected, because it has a foo attribute.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 16 (13 by maintainers)

Most upvoted comments

Sounds like he’s open to trying this out on the default HTML5 formatter, so that is a start. I’ll try to get a pull request in there so that, at least with the HTML5 formatter, empty strings are output as bare attributes.

In general, None will now work, so you can keep doing that. That said, I don’t think people should use None; instead, the HTML formatter should be changed to output empty-string attributes as bare attributes. Overriding the HTML formatter is probably the cleaner way to do this on output, especially if the author doesn’t want to change the default HTML output moving forward (XML should not do this, though).
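
For reference, a minimal sketch of the formatter-override option described above, not the change that was actually merged. It assumes a bs4 version whose formatter classes expose an overridable attributes() method; BareAttributeFormatter is a hypothetical name made up for this example, and html.parser is chosen only to keep the snippet self-contained. The attribute is set to an empty string rather than None, so the selector sees it, and the formatter turns it into a bare attribute on output only.

```python
from bs4 import BeautifulSoup
from bs4.formatter import HTMLFormatter


class BareAttributeFormatter(HTMLFormatter):
    """Hypothetical formatter: write empty-string attributes as bare attributes."""

    def attributes(self, tag):
        # Yield (name, value) pairs in sorted order; a value of None is
        # serialized as just the attribute name, with no ="..." part.
        for name, value in sorted(tag.attrs.items()):
            yield name, (None if value == "" else value)


soup = BeautifulSoup('<a href="http://www.example.com"></a>', "html.parser")
link = soup.a
link["foo"] = ""  # empty string instead of None

matched_without_foo = soup.select("a:not([foo])")  # -> []
matched_with_foo = soup.select("a[foo]")           # one match: the link itself

# Only the output step is customized; the tree still holds the empty string.
rendered = link.decode(formatter=BareAttributeFormatter())
print(rendered)
```

The point of doing it this way is that the tree keeps a real string value, so selectors never have to deal with None; whether the attribute is printed bare is purely an output decision.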

I guess we’ll see what he decides. But I guess you have options now.

Not a huge fan of having to handle these weird cases, but I can see BS really gets users accustomed to the idea that they can put anything into attributes and it should still come out reasonable. It turned out that it was quite trivial to do due to the way we wrote things, so it all worked out in the end 🙂.

Pull #213 will fix this issue.

@gir-bot remove S: wontfix @gir-bot add T: bug

I think there is sufficient reasoning to assert there is nothing to fix here. I am aware of no cases where BS itself ever inserts None as an attribute value (though I am open to being proven wrong 🙂). As things stand now, I am flagging this as a wontfix. Maybe there is some case I am overlooking?
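
A quick way to check the claim about parsed values, as a sketch only: it assumes the standard-library html.parser tree builder, and other builders are not checked here. A bare attribute in the markup comes back as an empty string, not None, and the :not selector then excludes the tag as expected.

```python
from bs4 import BeautifulSoup

# The markup already contains a bare attribute, so nothing is assigned by hand.
soup = BeautifulSoup('<a foo href="http://www.example.com"></a>', "html.parser")

parsed_value = soup.a["foo"]
print(repr(parsed_value))  # -> ''

excluded = soup.select("a:not([foo])")  # -> []
print(excluded)
```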

@gir-bot remove S: triage @gir-bot add S: wontfix