yt-dlp: Argument of type 'NoneType' is not iterable while running on Docker/airflow

DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE

  • I understand that I will be blocked if I intentionally remove or skip any mandatory* field

Checklist

  • I’m reporting a bug unrelated to a specific site
  • I’ve verified that I have updated yt-dlp to nightly or master (update instructions)
  • I’ve checked that all provided URLs are playable in a browser with the same IP and same login details
  • I’ve checked that all URLs and arguments with special characters are properly quoted or escaped
  • I’ve searched known issues and the bugtracker for similar issues including closed ones. DO NOT post duplicates
  • I’ve read the guidelines for opening an issue

Provide a description that is worded well enough to be understood

Extracting channel metadata like this:

import logging

from yt_dlp import YoutubeDL

channel_url = f'https://www.youtube.com/channel/{channel_id}'
with YoutubeDL() as ydl:
    result = ydl.extract_info(channel_url, download=False)
    logging.info("Youtube post extraction data %s", result)

Locally this code works fine and returns results. As soon as I started running it through Docker/Airflow, I began getting the error below. Execution never even reaches the logging.info call; it throws this error first:

[2023-12-20, 20:26:54 UTC] {logging_mixin.py:137} INFO - offset_val**: 2023-12-08T22:51:16.332110Z
[2023-12-20, 20:26:54 UTC] {youtube.py:285} INFO - youtube_creator_offset_key: youtube_data_ingestion_28020_offset, offset_val in cache: 2023-12-08T22:51:16.332110Z, published_after: 2023-12-08T22:51:16.332110Z
[2023-12-20, 20:26:54 UTC] {youtube.py:815} ERROR - argument of type 'NoneType' is not iterable
Traceback (most recent call last):
  File "/usr/local/airflow/dags/tasks/brandsafety/openreport/youtube.py", line 766, in process_channel_videos
    with YoutubeDL(ydl_opts) as ydl:
  File "/usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 681, in __init__
    self.print_debug_header()
  File "/usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 3922, in print_debug_header
    write_string(f'[Invalid date] {encoding_str}\n', encoding=None)
  File "/usr/local/lib/python3.9/site-packages/yt_dlp/utils/_utils.py", line 1429, in write_string
    if 'b' in getattr(out, 'mode', ''):
TypeError: argument of type 'NoneType' is not iterable
[2023-12-20, 20:26:55 UTC] {youtube.py:298} ERROR - ('Error while scraping youtube posts: %s', 'UCAcr6uli4eZhYKyMrKPB87A')
Traceback (most recent call last):
  File "/usr/local/airflow/dags/tasks/brandsafety/openreport/youtube.py", line 766, in process_channel_videos
    with YoutubeDL(ydl_opts) as ydl:
  File "/usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 681, in __init__
    self.print_debug_header()
  File "/usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 3922, in print_debug_header
    write_string(f'[Invalid date] {encoding_str}\n', encoding=None)
  File "/usr/local/lib/python3.9/site-packages/yt_dlp/utils/_utils.py", line 1429, in write_string
    if 'b' in getattr(out, 'mode', ''):
TypeError: argument of type 'NoneType' is not iterable
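The traceback suggests that Airflow has replaced sys.stdout/sys.stderr with a log writer that inherits from typing.IO, whose abstract mode property has an empty body and therefore returns None. A minimal stand-in (the class name here is hypothetical, not Airflow's actual class) reproduces the exact TypeError outside Airflow:

```python
import typing

# Hypothetical minimal stand-in for Airflow's redirected stdout writer,
# which in the affected versions inherits from typing.IO
class FakeLogWriter(typing.IO[str]):
    def write(self, s):
        return len(s)

out = FakeLogWriter()

# typing.IO declares `mode` as an abstract property whose body is `pass`,
# so the attribute EXISTS and evaluates to None -- getattr's '' default
# is never used
print(getattr(out, 'mode', ''))  # None

try:
    'b' in getattr(out, 'mode', '')
except TypeError as err:
    print(err)  # argument of type 'NoneType' is not iterable
```

This matches the failing line in yt_dlp/utils/_utils.py: getattr only falls back to its default when the attribute is missing, not when it is present but None.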

Provide verbose output that clearly demonstrates the problem

  • Run your yt-dlp command with -vU flag added (yt-dlp -vU <your command line>)
  • If using API, add 'verbose': True to YoutubeDL params instead
  • Copy the WHOLE output (starting with [debug] Command-line config) and insert it below

Complete Verbose Output

*** Reading local file: /usr/local/airflow/logs/dag_id=youtube_data_ingestion_dag/run_id=manual__2023-12-20T20:21:40.071276+00:00/task_id=fetch_and_process_social_content/map_index=0/attempt=2.log
[2023-12-20, 20:26:54 UTC] {taskinstance.py:1165} INFO - Dependencies all met for <TaskInstance: youtube_data_ingestion_dag.fetch_and_process_social_content manual__2023-12-20T20:21:40.071276+00:00 map_index=0 [queued]>
[2023-12-20, 20:26:54 UTC] {taskinstance.py:1165} INFO - Dependencies all met for <TaskInstance: youtube_data_ingestion_dag.fetch_and_process_social_content manual__2023-12-20T20:21:40.071276+00:00 map_index=0 [queued]>
[2023-12-20, 20:26:54 UTC] {taskinstance.py:1362} INFO - 
--------------------------------------------------------------------------------
[2023-12-20, 20:26:54 UTC] {taskinstance.py:1363} INFO - Starting attempt 2 of 2
[2023-12-20, 20:26:54 UTC] {taskinstance.py:1364} INFO - 
--------------------------------------------------------------------------------
[2023-12-20, 20:26:54 UTC] {taskinstance.py:1383} INFO - Executing <Mapped(_PythonDecoratedOperator): fetch_and_process_social_content> on 2023-12-20 20:21:40.071276+00:00
[2023-12-20, 20:26:54 UTC] {standard_task_runner.py:55} INFO - Started process 1860 to run task
[2023-12-20, 20:26:54 UTC] {standard_task_runner.py:82} INFO - Running: ['airflow', 'tasks', 'run', 'youtube_data_ingestion_dag', 'fetch_and_process_social_content', 'manual__2023-12-20T20:21:40.071276+00:00', '--job-id', '140', '--raw', '--subdir', 'DAGS_FOLDER/data_ingestion_dag.py', '--cfg-path', '/tmp/tmpbt7fbvi3', '--map-index', '0']
[2023-12-20, 20:26:54 UTC] {standard_task_runner.py:83} INFO - Job 140: Subtask fetch_and_process_social_content
[2023-12-20, 20:26:54 UTC] {task_command.py:376} INFO - Running <TaskInstance: youtube_data_ingestion_dag.fetch_and_process_social_content manual__2023-12-20T20:21:40.071276+00:00 map_index=0 [running]> on host 7d8ef6825cf3
[2023-12-20, 20:26:54 UTC] {taskinstance.py:1590} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=youtube_data_ingestion_dag
AIRFLOW_CTX_TASK_ID=fetch_and_process_social_content
AIRFLOW_CTX_EXECUTION_DATE=2023-12-20T20:21:40.071276+00:00
AIRFLOW_CTX_TRY_NUMBER=2
AIRFLOW_CTX_DAG_RUN_ID=manual__2023-12-20T20:21:40.071276+00:00
[2023-12-20, 20:26:54 UTC] {youtube.py:272} INFO - dag_id: youtube_data_ingestion_dag, Starting YouTube posts ingestion for the viral_nation_id: 28020
[2023-12-20, 20:26:54 UTC] {logging_mixin.py:137} INFO - offset_val**: 2023-12-08T22:51:16.332110Z
[2023-12-20, 20:26:54 UTC] {youtube.py:285} INFO - youtube_creator_offset_key: youtube_data_ingestion_28020_offset, offset_val in cache: 2023-12-08T22:51:16.332110Z, published_after: 2023-12-08T22:51:16.332110Z
[2023-12-20, 20:26:54 UTC] {youtube.py:815} ERROR - argument of type 'NoneType' is not iterable
Traceback (most recent call last):
  File "/usr/local/airflow/dags/tasks/brandsafety/openreport/youtube.py", line 766, in process_channel_videos
    with YoutubeDL(ydl_opts) as ydl:
  File "/usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 681, in __init__
    self.print_debug_header()
  File "/usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 3922, in print_debug_header
    write_string(f'[Invalid date] {encoding_str}\n', encoding=None)
  File "/usr/local/lib/python3.9/site-packages/yt_dlp/utils/_utils.py", line 1429, in write_string
    if 'b' in getattr(out, 'mode', ''):
TypeError: argument of type 'NoneType' is not iterable
[2023-12-20, 20:26:55 UTC] {youtube.py:298} ERROR - ('Error while scraping youtube posts: %s', 'UCAcr6uli4eZhYKyMrKPB87A')
Traceback (most recent call last):
  File "/usr/local/airflow/dags/tasks/brandsafety/openreport/youtube.py", line 766, in process_channel_videos
    with YoutubeDL(ydl_opts) as ydl:
  File "/usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 681, in __init__
    self.print_debug_header()
  File "/usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 3922, in print_debug_header
    write_string(f'[Invalid date] {encoding_str}\n', encoding=None)
  File "/usr/local/lib/python3.9/site-packages/yt_dlp/utils/_utils.py", line 1429, in write_string
    if 'b' in getattr(out, 'mode', ''):
TypeError: argument of type 'NoneType' is not iterable

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/airflow/dags/tasks/brandsafety/openreport/youtube.py", line 292, in youtube_fetch_social_content_using_scraping
    process_channel_videos(viral_nation_id, 'UCAcr6uli4eZhYKyMrKPB87A', published_after, user_to_process)
  File "/usr/local/airflow/dags/tasks/brandsafety/openreport/youtube.py", line 818, in process_channel_videos
    raise Exception(
Exception: ('Error while scraping youtube posts: %s', 'UCAcr6uli4eZhYKyMrKPB87A')
[2023-12-20, 20:26:55 UTC] {taskinstance.py:1851} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/airflow/dags/tasks/brandsafety/openreport/youtube.py", line 766, in process_channel_videos
    with YoutubeDL(ydl_opts) as ydl:
  File "/usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 681, in __init__
    self.print_debug_header()
  File "/usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 3922, in print_debug_header
    write_string(f'[Invalid date] {encoding_str}\n', encoding=None)
  File "/usr/local/lib/python3.9/site-packages/yt_dlp/utils/_utils.py", line 1429, in write_string
    if 'b' in getattr(out, 'mode', ''):
TypeError: argument of type 'NoneType' is not iterable

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/airflow/dags/tasks/brandsafety/openreport/youtube.py", line 292, in youtube_fetch_social_content_using_scraping
    process_channel_videos(viral_nation_id, 'UCAcr6uli4eZhYKyMrKPB87A', published_after, user_to_process)
  File "/usr/local/airflow/dags/tasks/brandsafety/openreport/youtube.py", line 818, in process_channel_videos
    raise Exception(
Exception: ('Error while scraping youtube posts: %s', 'UCAcr6uli4eZhYKyMrKPB87A')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/decorators/base.py", line 188, in execute
    return_value = super().execute(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 193, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/usr/local/airflow/dags/data_ingestion_dag.py", line 75, in fetch_and_process_social_content
    fetch_and_process_social_content_data(dag, dag_id, configs,
  File "/usr/local/airflow/dags/tasks/brandsafety/openreport/processor/fetch_social_content_processor.py", line 50, in fetch_and_process_social_content_data
    youtube_fetch_social_content_using_scraping(dag, dag_id, configs, job_type_cfg, influencer)
  File "/usr/local/airflow/dags/tasks/brandsafety/openreport/youtube.py", line 299, in youtube_fetch_social_content_using_scraping
    raise Exception(
Exception: Error while fetching video posts jsons
[2023-12-20, 20:26:55 UTC] {taskinstance.py:1401} INFO - Marking task as FAILED. dag_id=youtube_data_ingestion_dag, task_id=fetch_and_process_social_content, map_index=0, execution_date=20231220T202140, start_date=20231220T202654, end_date=20231220T202655
[2023-12-20, 20:26:55 UTC] {standard_task_runner.py:100} ERROR - Failed to execute job 140 for task fetch_and_process_social_content (Error while fetching video posts jsons; 1860)
[2023-12-20, 20:26:55 UTC] {local_task_job.py:159} INFO - Task exited with return code 1
[2023-12-20, 20:26:55 UTC] {taskinstance.py:2623} INFO - 0 downstream tasks scheduled from follow-on schedule check

About this issue

  • Original URL
  • State: closed
  • Created 6 months ago
  • Reactions: 1
  • Comments: 32 (20 by maintainers)

Most upvoted comments

I stand corrected by the CPython source code:

from abc import abstractmethod
from typing import AnyStr, Generic

class IO(Generic[AnyStr]):
    @property
    @abstractmethod
    def mode(self) -> str:
        pass

Theoretically, overriding the mode property to return a str should be required, since it is abstract. In practice that doesn't seem to be enforced, so this is indeed an external issue. The ordering of the base classes in Airflow matters: subclassing IOBase, typing.IO[str] does not produce a mypy error for the unimplemented abstract members, while subclassing typing.IO[str], IOBase does.
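One defensive guard on the yt-dlp side (a sketch of the idea only, not necessarily the committed patch) is to coerce a None mode to an empty string before the membership test:

```python
import io

def is_binary_stream(out):
    # getattr's default only applies when the attribute is missing;
    # Airflow's writer HAS a `mode` attribute that evaluates to None,
    # so the None must be coerced explicitly
    return 'b' in (getattr(out, 'mode', None) or '')

class NoneModeWriter:  # hypothetical stand-in for the problematic writer
    mode = None

print(is_binary_stream(NoneModeWriter()))  # False instead of TypeError
print(is_binary_stream(io.StringIO()))     # False (no mode attribute at all)
```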

This makes me think that our code behaves as expected and the issue should be raised on the airflow repo. I will still commit the workaround patch soon™
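Until a fixed release lands, one possible caller-side workaround (an untested sketch; it assumes the interpreter's original streams are still reachable as sys.__stdout__/sys.__stderr__, which Airflow does not replace) is to restore the real streams while constructing YoutubeDL:

```python
import contextlib
import sys

@contextlib.contextmanager
def real_streams():
    """Temporarily undo stdout/stderr redirection (e.g. Airflow's).

    sys.__stdout__ / sys.__stderr__ hold the streams the interpreter
    started with; Airflow only replaces sys.stdout / sys.stderr.
    """
    with contextlib.redirect_stdout(sys.__stdout__ or sys.stdout), \
         contextlib.redirect_stderr(sys.__stderr__ or sys.stderr):
        yield

# Hypothetical usage inside the Airflow task:
# with real_streams(), YoutubeDL(ydl_opts) as ydl:
#     result = ydl.extract_info(channel_url, download=False)
```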

@botsmaster Your comment is not related to this issue in any way. Please open a new issue with an appropriate verbose log.