site-kit-wp: Show placeholder text when author or category is not found in WordPress

Feature Description

In some rare cases, the display name for an author or category ID that has been tracked in Analytics cannot be found in the WordPress database, for example because the author or category was deleted after it was tracked.

Note: This is for the authors and categories where an ID number is being displayed. We are able to differentiate between an author or category name that happens to be an integer and an integer ID.

– Updates until 8 Nov 2023 –

The News team suggested that we not track IDs alone, since IDs are not very useful outside of Site Kit. This is a consideration we never had to make before.

After a lot of discussion in Slack (thread, thread, thread) we decided to use usernames for authors, since they are immutable, and slugs for categories, which change less often than their display names. The downside is that if a tracked value is modified, we cannot link the old and new values. E.g. if categoryA is renamed to categoryZ, we will track and display page views for the two names as separate categories.

An improvement here could be to track IDs as well - however, having them in the existing custom dimensions would mean having two pieces of data in a single custom dimension. This would be hard to report on.

We could potentially track IDs of authors and categories as separate custom dimensions. However, currently, if a page URL changes, we don’t go to extreme lengths to preserve the GA metrics across all versions of the page’s URL that may have tracked data. So we see no real need to track IDs for such edge cases.

– Update: 9th Nov 2023 –

In the end, we concluded that tracking authors’ usernames would not be very user friendly within the Analytics Admin Dashboard (or within Site Kit). Journalists might not use their full names in their usernames, whereas the display name is usually what is shown to the public on a post and is easier to read in a report. So we will stick to tracking and displaying authors’ display names.

Similarly, instead of tracking slug names, we decided to track the proper Category names.

We have intentionally decided not to worry about the case where an author or category name is modified after it was tracked. Firstly, these are edge cases. Secondly, the user can still create an aggregate report knowing what they’ve modified. Finally, for a specific report duration (e.g. 28 days), the difference between the tracked name and the modified name will rarely be an issue.


Do not alter or remove anything below. The following sections will be managed by moderators only.

Acceptance criteria

  • For the googlesitekit_post_author custom dimension, track the post author’s display_name when sending data to GA and when displaying authors in the Top Authors KMW tile instead of the author ID.
    • If the display name is not set, we should fall back to using the username (if WP does not do this by default).
  • For the googlesitekit_post_categories custom dimension, track a list of category names when sending data to GA and when displaying them in the Top Categories KMW tile instead of the list of category IDs.
    • Categories should be separated using ; (a semicolon followed by a space). E.g. Oranges, Pears, and others; Vegetables; Gardening
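
As a rough illustration of the two criteria above, the following JavaScript sketch mirrors the intended logic. The actual change belongs in the plugin's PHP code. The function names authorDimensionValue and categoriesDimensionValue, and the plain objects standing in for WordPress users and categories, are assumptions made for this example only.

```javascript
// Hypothetical helpers for illustration; not part of Site Kit.

// Returns the value to track for googlesitekit_post_author, or undefined
// when no user is available (in which case the dimension is left unset).
function authorDimensionValue( user ) {
	if ( ! user ) {
		return undefined;
	}
	// Prefer the display name; fall back to the username when it is empty.
	return user.display_name || user.user_login || undefined;
}

// Returns the value to track for googlesitekit_post_categories.
// '; ' is the separator because category names may themselves contain commas.
function categoriesDimensionValue( categories ) {
	return categories.map( ( category ) => category.name ).join( '; ' );
}

const categories = [
	{ name: 'Oranges, Pears, and others' },
	{ name: 'Vegetables' },
	{ name: 'Gardening' },
];
const tracked = categoriesDimensionValue( categories );
console.log( tracked );
// Oranges, Pears, and others; Vegetables; Gardening

// Splitting on '; ' recovers the three names; splitting on ', ' would not.
console.log( tracked.split( '; ' ).length ); // 3
```

The example list from the criteria shows why a comma would be a poor separator: the first category name already contains commas.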

Implementation Brief

  • In includes/Modules/Analytics_4.php, modify the get_custom_dimensions_data() method:

    • For the googlesitekit_post_author case:
      • Pass the $post->post_author to the get_userdata() function to fetch the $user object.
      • Set $data[ $custom_dimension ] to be $user->display_name. If the $user->display_name is empty, fall back on $user->user_login.
      • Similar to the recent fix for blank categories, if there is no post author ID, or no user is found for the author ID (e.g. due to manual database corruption), then do not set this custom dimension.
    • For the googlesitekit_post_categories case:
      • Using wp_list_pluck(), pluck the name property within $categories.
      • Set $data[ $custom_dimension ] to use implode - however, use ; (semicolon with a space) as the separator.
  • In includes/Modules/Analytics_4/Report/Response.php:

    • Remove the lines in parse_response which use the Custom_Dimensions_Response_Parser class to modify the $response object.
  • Remove the Custom_Dimensions_Response_Parser class and its associated tests.

  • In assets/js/modules/analytics-4/components/widgets/PopularAuthorsWidget.js:

    • Modify the dimensionFilters as per the IB of #7737 to create a not expression that matches exactly (not set) rather than digits only.
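
For reference, a "not" expression that matches exactly (not set) can be written in the shape of the GA4 Data API's FilterExpression, as sketched below. This is only a sketch under stated assumptions: the helper name excludeExactValue is hypothetical, the customEvent: prefix assumes an event-scoped custom dimension, and how Site Kit's dimensionFilters option is translated into this structure is covered by #7737 and not shown here.

```javascript
// Hypothetical helper; builds a GA4 Data API FilterExpression that keeps only
// rows whose dimension value is NOT exactly equal to `value`.
function excludeExactValue( fieldName, value ) {
	return {
		notExpression: {
			filter: {
				fieldName,
				stringFilter: {
					matchType: 'EXACT',
					value,
				},
			},
		},
	};
}

// Assumed API name of the author custom dimension.
const authorFilter = excludeExactValue(
	'customEvent:googlesitekit_post_author',
	'(not set)'
);
console.log( JSON.stringify( authorFilter, null, 2 ) );
```

A digits-only match no longer fits once the dimension holds names instead of IDs, which is why the filter excludes the single (not set) value instead.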

Test Coverage

  • Update Analytics_4Test::test_get_custom_dimensions_data() to match the above.

QA Brief

  • On a test site, connect Site Kit to GA4 with a property that has no old custom dimensions data.
  • Check the Google Analytics snippet added by Site Kit on the front end, i.e. the Post pages which have an author and some categories. Verify the gtag JS config has names now as per the AC for the above custom dimensions instead of IDs.
  • Using the same test site, which now tracks authors and categories by name (instead of ID), check again the next day to ensure the TopCategories and PopularAuthors KMW tiles still display category and author names as before, despite the change in the tracking format.
  • Verify that there are no (not set) records appearing for the Popular Authors widget tile. (As is the case already - but ensure a regression isn’t added here.) Also test the same for the Popular Categories widget tile.

Additional QAB for the followup PR:

  • Connect Site Kit to a GA4 property that doesn’t have the custom dimensions set up.
  • Create the custom dimensions via the Key Metrics tiles.
  • Verify the custom dimension descriptions are created in Analytics as follows:
    • googlesitekit_post_date: Created by Site Kit: Date when a post was created
    • googlesitekit_post_author: Created by Site Kit: WordPress name of the post author
    • googlesitekit_post_categories: Created by Site Kit: Names of categories assigned to posts
    • googlesitekit_post_type: Created by Site Kit: Content type of posts
  • Verify the custom dimension names and descriptions are created in English regardless of the language settings in WordPress.

Changelog entry

  • Track author and category names rather than IDs for the relevant custom dimensions, and display as they are in their corresponding Key Metrics widgets.

About this issue

  • State: closed
  • Created 8 months ago
  • Comments: 22

Most upvoted comments

Do we not also need to modify the dimensionFilters of TopCategoriesWidget then as well? Or is this being handled in a separate issue?

#7737 is taking care of this for the TopCategoriesWidget specifically. I might end up changing things slightly compared to the IB as per my Slack message earlier today.

Regarding the author dimension data, I don’t see the fallback accounted for which is mentioned in the AC:

Have updated the IB to use user_login as a fallback. (I thought WordPress defaulted the display name property to the username but it doesn’t.)

AC ✅

This change will make the ID parsing of the report for these dimensions unnecessary (no longer used), so we might want to open a follow-up to remove it. We could also include the removal in the IB for this one, as it’s probably quite straightforward. If we later decide to do this again for whatever reason, we can always pull the code back from the history rather than keep it around unused.

@sigal-teller We discussed this issue on our last AC Sync and felt your input could help on how to display these, e.g. show “author ID x” with a tooltip that says the author ID does not have a display name.

I would probably hold this issue for now until @marrrmarrr comes back, as the News team had some feedback on sending the string display names directly to GA instead of IDs. That way users can see the category and author names directly in Analytics as well, and don’t have to rely on Site Kit to display this data “meaningfully”.