ox: Incorrect HTML dumped after being parsed
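Dumping HTML after parsing it with Ox.parse produces incorrect HTML: the void <meta> elements are treated as open containers, so the elements that follow are nested inside them and stray </meta> closing tags are emitted.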
Ox::VERSION == "2.8.4"
require 'ox'
html = <<-HTML
<!DOCTYPE html >
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<title>Hello World</title>
</head>
<body>
<h1>Hello World</h1>
<p>Lorem Ipsum Dolor Sit</p>
</body>
</html>
HTML
Ox.default_options = {
  mode: :generic,
  effort: :tolerant,
  smart: true
}
puts Ox.dump(Ox.parse(html))
The output is:
<!DOCTYPE html >
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<title>Hello World</title>
</meta>
</meta>
</head>
<body>
<h1>Hello World</h1>
<p>Lorem Ipsum Dolor Sit</p>
</body>
</html>
Either this is a bug or there’s documentation missing on how to parse, alter and re-constitute HTML like Nokogiri…
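For anyone who wants to check their own dumped output for this symptom, here is a minimal sketch in plain Ruby. It does not use Ox at all; `stray_void_closers` and `VOID_ELEMENTS` are hypothetical names made up for this example, and the regex only looks for closing tags, so treat it as a quick sanity check rather than an HTML validator.

```ruby
# HTML void elements never take a closing tag (list from the HTML standard).
VOID_ELEMENTS = %w[area base br col embed hr img input link meta source track wbr].freeze

# Hypothetical helper (not part of Ox): returns the names of void elements
# that appear with a closing tag in the given markup, in document order.
def stray_void_closers(markup)
  markup.scan(%r{</\s*([a-zA-Z][a-zA-Z0-9]*)\s*>})
        .flatten
        .map(&:downcase)
        .select { |name| VOID_ELEMENTS.include?(name) }
end

# The <head> portion of the output reported above.
dumped = <<~HTML
  <head>
  <meta charset="utf-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <title>Hello World</title>
  </meta>
  </meta>
  </head>
HTML

p stray_void_closers(dumped)
# => ["meta", "meta"]
```

An empty array means no void element was given a closing tag.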
About this issue
- State: closed
- Created 6 years ago
- Comments: 16 (9 by maintainers)
So Ox would have to support HTML using the Ox.parse and Ox.dump methods. Something to put on the requested feature list. I can see how it involves fewer steps even if the SAX/Builder combo has the potential to be a lot faster.
You know my weakness, benchmarks. Now I have to put together something. 😊
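To make concrete what HTML support in a dump step would roughly involve, here is a small sketch of void-element-aware serialization. It is an illustration only: `Node`, `VOID_TAGS`, and `dump_html` are hypothetical names invented for this example, they are not Ox::Element or any Ox API, and this is not how Ox implements or would implement the feature. Escaping is left out to keep it short.

```ruby
# Assumption: a hypothetical, minimal node type used only for this sketch.
# It is not Ox::Element and does not mirror Ox's internals.
Node = Struct.new(:name, :attributes, :children)

# Void elements per the HTML standard: start tag only, no end tag.
VOID_TAGS = %w[area base br col embed hr img input link meta source track wbr].freeze

# Serialize a Node tree as HTML. Void elements get a start tag only;
# every other element gets a start tag, its children, and an end tag.
# Text children are plain Strings. Escaping is omitted for brevity.
def dump_html(node)
  return node if node.is_a?(String)

  attrs = node.attributes.map { |key, value| %( #{key}="#{value}") }.join
  open_tag = "<#{node.name}#{attrs}>"
  return open_tag if VOID_TAGS.include?(node.name.downcase)

  "#{open_tag}#{node.children.map { |child| dump_html(child) }.join}</#{node.name}>"
end

head = Node.new('head', {}, [
  Node.new('meta', { 'charset' => 'utf-8' }, []),
  Node.new('title', {}, ['Hello World'])
])

puts dump_html(head)
# => <head><meta charset="utf-8"><title>Hello World</title></head>
```

The only HTML-specific decision here is the early return for void tags; the parse side would need the matching rule so that siblings are not nested under a void element in the first place.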
Can you give an example of what you mean by tags?
I suppose a set of examples with comments might be helpful. There are many use cases.