sqlparse: Inconsistent behavior of self.parent (and has_ancestor())

(Python 3.4, sqlparse 0.1.18) The sample code is as follows:

import sqlparse
sql = "CREATE TABLE test();"
result = sqlparse.parse(sql)
for item in result[0].tokens:
    print("Type: %s, ttype: %s, Parent: %s, has_ancestor: %s" % \
        (type(item), item.ttype, item.parent, item.has_ancestor(result[0])))

I expected every item to report has_ancestor: True, but only one does.

Type: <class 'sqlparse.sql.Token'>, ttype: Token.Keyword.DDL, Parent: None, has_ancestor: False
Type: <class 'sqlparse.sql.Token'>, ttype: Token.Text.Whitespace, Parent: None, has_ancestor: False
Type: <class 'sqlparse.sql.Token'>, ttype: Token.Keyword, Parent: None, has_ancestor: False
Type: <class 'sqlparse.sql.Token'>, ttype: Token.Text.Whitespace, Parent: None, has_ancestor: False
Type: <class 'sqlparse.sql.Function'>, ttype: None, Parent: CREATE TABLE test();, has_ancestor: True
Type: <class 'sqlparse.sql.Token'>, ttype: Token.Punctuation, Parent: None, has_ancestor: False

Because of this behavior, it is not easy to traverse the items in a parsed statement.
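To make the symptom easier to spot, here is a minimal sketch of a checker that walks a tree and collects every child whose parent link does not point back at the group that contains it. It relies only on the .parent and .tokens attributes shown above; the Node class and the find_orphans name are hypothetical stand-ins for illustration, not part of sqlparse.

```python
class Node:
    """Hypothetical stand-in for a parsed token; not part of sqlparse."""

    def __init__(self, name, children=()):
        self.name = name
        self.parent = None  # deliberately left unset, as in the report
        self.tokens = list(children)


def find_orphans(group):
    """Return every descendant whose .parent is not its containing group."""
    orphans = []
    # Leaf nodes may have no .tokens attribute at all, so fall back to [].
    for child in getattr(group, "tokens", None) or []:
        if child.parent is not group:
            orphans.append(child)
        orphans.extend(find_orphans(child))
    return orphans


# Build a tree where only the grouped child has its parent link set,
# mirroring the output reported above.
keyword = Node("CREATE")
function = Node("test()")
stmt = Node("stmt", [keyword, function])
function.parent = stmt

orphan_names = [node.name for node in find_orphans(stmt)]
print(orphan_names)
```

On a tree with consistent parent links the function should return an empty list, which makes it usable as a quick regression check.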

About this issue

State: closed
Created 8 years ago
Comments: 15 (11 by maintainers)

Commits related to this issue

Fix token-parent behavior Closes issue #226 — committed to andialbrecht/sqlparse by vmuriart 8 years ago
Implement ASSERT statement (#226) As supported by PostgreSQL and BigQuery (with some differences between them) — committed to cube-js/sqlparse by Dandandan 4 years ago

Most upvoted comments

@bc-lee it suddenly hit me what you meant with this. I made the necessary corrections. Below is the new output. Please let us know if this didn’t fix it, and re-open the issue.

from __future__ import print_function
import sqlparse


sql = "CREATE TABLE test();"
stmt = sqlparse.parse(sql)[0]

for item in stmt.tokens:
    print("Type: {:34} ttype: {:21} Parent: {:20} has_ancestor: {}".format(
        type(item), item.ttype, item.parent, item.has_ancestor(stmt)))

Type: <class 'sqlparse.sql.Token'>       ttype: Token.Keyword.DDL     Parent: CREATE TABLE test(); has_ancestor: True
Type: <class 'sqlparse.sql.Token'>       ttype: Token.Text.Whitespace Parent: CREATE TABLE test(); has_ancestor: True
Type: <class 'sqlparse.sql.Token'>       ttype: Token.Keyword         Parent: CREATE TABLE test(); has_ancestor: True
Type: <class 'sqlparse.sql.Token'>       ttype: Token.Text.Whitespace Parent: CREATE TABLE test(); has_ancestor: True
Type: <class 'sqlparse.sql.Identifier'>  ttype: None                  Parent: CREATE TABLE test(); has_ancestor: True
Type: <class 'sqlparse.sql.Parenthesis'> ttype: None                  Parent: CREATE TABLE test(); has_ancestor: True
Type: <class 'sqlparse.sql.Token'>       ttype: Token.Punctuation     Parent: CREATE TABLE test(); has_ancestor: True

vmuriart on Jun 12, 2016

I personally think the XML API in Python is unsavory. bs4’s API would probably be much nicer (and maybe more Pythonic).

Here is a snippet of the bs4 doc’s nav:

Navigating the tree
  Going down
    Navigating using tag names
    .contents and .children
    .descendants
    .string
    .strings and stripped_strings
  Going up
    .parent
    .parents
  Going sideways
    .next_sibling and .previous_sibling
    .next_siblings and .previous_siblings
  Going back and forth
    .next_element and .previous_element
    .next_elements and .previous_elements

These properties sound pretty reasonable to me, with the possible exception of the navigation by tag name, which I imagine for sql could look something like:

query.where[2].orderby

Which might evaluate as the third where clause’s order keyword object (ASC, DESC).

But I hesitate to say it’s something to demand, as I’m not sure SQL’s structure lends itself to consistent name navigation; the SQL parser engine author could probably speak better to that.

bs4 also declares the .string property to be the canonical way to get a (unicode) object of the node’s text. That may conflict with the existing API, which IIRC has a getter/property named something verbose.
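For what it's worth, the "going up" and "going sideways" parts of that list could be sketched on top of nothing more than .parent and .tokens, provided the parent links are set consistently. The helper names parents and next_sibling below are borrowed from bs4's vocabulary and are assumptions for illustration; they are not existing sqlparse functions, and Node is a hypothetical stand-in for a parsed token.

```python
class Node:
    """Hypothetical stand-in for a parsed token; not part of sqlparse."""

    def __init__(self, name, children=()):
        self.name = name
        self.parent = None
        self.tokens = list(children)
        # Set the back-reference on every child as the group is built.
        for child in self.tokens:
            child.parent = self


def parents(node):
    """Yield each ancestor of node, nearest first (like bs4's .parents)."""
    current = node.parent
    while current is not None:
        yield current
        current = current.parent


def next_sibling(node):
    """Return the node following `node` in its parent's tokens, or None."""
    if node.parent is None:
        return None
    siblings = node.parent.tokens
    # Compare by identity: two tokens with equal text are distinct nodes.
    for index, sibling in enumerate(siblings):
        if sibling is node:
            if index + 1 < len(siblings):
                return siblings[index + 1]
            return None
    return None


name = Node("test")
paren = Node("()")
func = Node("test()", [name, paren])
stmt = Node("stmt", [Node("CREATE"), func])

ancestor_names = [p.name for p in parents(name)]
print(ancestor_names)
print(next_sibling(name).name)
```

Both helpers silently stop at the first node whose parent is None, so they only give complete answers once every token points back at its containing group.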

:twocents:

thorsummoner on May 31, 2016