site-kit-wp: Show placeholder text when author or category is not found in WordPress
Feature Description
In some rare cases, the display name for an author or category ID that has been tracked in Analytics could “not be found” in the WordPress database. This could be because the Author or Category was deleted since it was tracked.
Note: This is for the authors and categories where an ID number is being displayed. We are able to differentiate between an author or category name that is an actual integer and an integer ID.
– Updates until 8 Nov 2023 –
The news team suggested we don’t use IDs only since this data is not very useful outside of Site Kit - a consideration we never had to make before.
After a lot of discussion in Slack (thread, thread, thread), we decided to use usernames for authors, since they are immutable, and slug names for categories, which would change less often than their display names. The downside is that if any of these are modified, we won’t be able to reconcile the change: e.g. if categoryA is renamed to categoryZ, we will track and display page views for both categories separately.
An improvement here could be to track IDs as well - however, having them in the existing custom dimensions would mean having two pieces of data in a single custom dimension. This would be hard to report on.
We could potentially track IDs of authors and categories as separate custom dimensions. However, currently, if a page URL changes, we don’t go to extreme lengths to preserve the GA metrics across all versions of a page’s URL that it may have tracked data for. So we find no real need to track IDs for such edge cases.
– Update: 9th Nov 2023 –
In the end, tracking authors’ usernames would not be very user friendly within the Analytics Admin Dashboard (and within Site Kit), as journalists might not use their full names in their usernames. The full name is usually what is displayed to the public on a post and is user friendly to see in a report. So we will stick to tracking and displaying authors’ display names.
Similarly, instead of tracking slug names, we decided to track the proper Category names.
We have intentionally decided not to worry about the edge case where an author or category name is modified since it was tracked. Firstly, these are edge cases. Secondly, the user can still create an aggregate report knowing what they’ve modified. And finally, for a specific report duration (e.g. 28 days), the difference between the tracked name and the modified name will rarely be an issue.
Do not alter or remove anything below. The following sections will be managed by moderators only.
Acceptance criteria
- For the `googlesitekit_post_author` custom dimension, track the post author’s `display_name` when sending data to GA and when displaying authors in the `Top Authors` KMW tile instead of the author ID.
  - If the display name is not set, we should fall back to using the username (if WP does not do this by default).
- For the `googlesitekit_post_categories` custom dimension, track a list of category `name` when sending data to GA and when displaying them in the `Top Categories` KMW tile instead of the list of category IDs.
  - Each category should be split using `; `. E.g. `Oranges, Pears, and others; Vegetables; Gardening`
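As a rough illustration of the two rules above, here is a minimal JavaScript sketch. The actual change is made in PHP (see the Implementation Brief); `authorDimensionValue`, `categoriesDimensionValue`, and the plain objects passed to them are hypothetical stand-ins for illustration, not Site Kit code. Returning `undefined` for a missing user mirrors the Implementation Brief's note that the dimension should not be set in that case.

```javascript
// Hypothetical helpers mirroring the acceptance criteria; not part of Site Kit.

// Returns the value to track for the author dimension, or undefined when
// there is no usable user (in which case the dimension would not be set).
function authorDimensionValue( user ) {
	if ( ! user ) {
		return undefined;
	}
	// Prefer the display name; fall back to the username when it is empty.
	return user.display_name || user.user_login || undefined;
}

// Returns the category names joined with "; " (semicolon plus a space).
function categoriesDimensionValue( categories ) {
	return categories.map( ( category ) => category.name ).join( '; ' );
}

// Usage with made-up data.
const author = authorDimensionValue( { display_name: '', user_login: 'jdoe' } );
const categories = categoriesDimensionValue( [
	{ name: 'Oranges, Pears, and others' },
	{ name: 'Vegetables' },
	{ name: 'Gardening' },
] );
console.log( author ); // jdoe
console.log( categories ); // Oranges, Pears, and others; Vegetables; Gardening
```

The first category name contains commas, which is presumably why the criteria call for `; ` rather than `, ` as the separator.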
Implementation Brief
- In `includes/Modules/Analytics_4.php`, modify the `get_custom_dimensions_data()` method:
  - For the `googlesitekit_post_author` case:
    - Pass the `$post->post_author` to the `get_userdata()` function to fetch the `$user` object.
    - Set `$data[ $custom_dimension ]` to be `$user->display_name`. If the `$user->display_name` is empty, fall back on `$user->user_login`.
    - Similar to the recent fix for blank categories, if there is no post author ID or user found for the author ID (some manual database corruption), then do not set this custom dimension.
  - For the `googlesitekit_post_categories` case:
    - Using `wp_list_pluck()`, pluck the `name` property within `$categories`.
    - Set `$data[ $custom_dimension ]` to use `implode` - however, use `; ` (semicolon with a space) as the separator.
- In `includes/Modules/Analytics_4/Report/Response.php`:
  - Remove the lines in `parse_response` which use the `Custom_Dimensions_Response_Parser` class to modify the `$response` object.
- Remove the `Custom_Dimensions_Response_Parser` class and its associated tests.
- In `assets/js/modules/analytics-4/components/widgets/PopularAuthorsWidget.js`:
  - Modify the `dimensionFilters` as per the IB of #7737 to create a not expression that matches exactly `(not set)` rather than digits only.
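To make the last step more concrete, here is a rough sketch of a "not" filter expression that excludes rows whose dimension value is exactly `(not set)`. It follows the GA4 Data API `FilterExpression` shape (`notExpression`, `stringFilter`, `matchType: 'EXACT'`); the shape that Site Kit's `dimensionFilters` option actually accepts may differ. The helper names `excludeNotSet` and `rowPasses`, the sample rows, and the `customEvent:`-prefixed dimension name are assumptions for illustration only.

```javascript
// Builds a GA4 Data API style filter expression that keeps only rows whose
// value for `fieldName` is NOT exactly "(not set)". Hypothetical helper.
function excludeNotSet( fieldName ) {
	return {
		notExpression: {
			filter: {
				fieldName,
				stringFilter: { matchType: 'EXACT', value: '(not set)' },
			},
		},
	};
}

// Local stand-in for how such an expression would be applied to a row;
// in practice the Analytics API does this server-side.
function rowPasses( expression, row ) {
	const { fieldName, stringFilter } = expression.notExpression.filter;
	return row[ fieldName ] !== stringFilter.value;
}

const field = 'customEvent:googlesitekit_post_author'; // assumed dimension name
const filter = excludeNotSet( field );
const rows = [
	{ [ field ]: 'Jane Doe' },
	{ [ field ]: '(not set)' },
	{ [ field ]: '42' },
];
const kept = rows.filter( ( row ) => rowPasses( filter, row ) );
console.log( kept.map( ( row ) => row[ field ] ) ); // [ 'Jane Doe', '42' ]
```

In this sketch only the exact string `(not set)` is excluded, so a display name that happens to consist of digits is still kept.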
Test Coverage
- Update `Analytics_4Test::test_get_custom_dimensions_data()` to match the above.
QA Brief
- On a test site, connect Site Kit to GA4 with a property that has no old custom dimensions data.
- Check the Google Analytics snippet added by Site Kit on the front end, i.e. the Post pages which have an author and some categories. Verify the gtag JS config has names now as per the AC for the above custom dimensions instead of IDs.
- Using the same test site which now tracks authors and categories with names (instead of IDs), wait a day and check again the next day to ensure the categories and authors still display names in their respective TopCategories and PopularAuthors KMW tiles as before despite the change in the tracking format.
- Verify that there are no `(not set)` records appearing for the Popular Authors widget tile. (As is the case already - but ensure a regression isn’t added here.) Also test the same for the Popular Categories widget tile.
Additional QAB for the followup PR:
- Connect Site Kit to a GA4 property that doesn’t have the custom dimensions setup.
- Create the custom dimensions via the Key Metrics tiles.
- Verify the custom dimension descriptions are created in Analytics as follows:
  - `googlesitekit_post_date`: Created by Site Kit: Date when a post was created
  - `googlesitekit_post_author`: Created by Site Kit: WordPress name of the post author
  - `googlesitekit_post_categories`: Created by Site Kit: Names of categories assigned to posts
  - `googlesitekit_post_type`: Created by Site Kit: Content type of posts
- Verify the custom dimension names and descriptions are created in English regardless of the language settings in WordPress.
Changelog entry
- Track author and category names rather than IDs for the relevant custom dimensions, and display as they are in their corresponding Key Metrics widgets.
About this issue
- State: closed
- Created 8 months ago
- Comments: 22
#7737 is taking care of this for the `TopCategoriesWidget` specifically. I might end up changing things slightly compared to the IB as per my Slack message earlier today.
Have updated the IB to use `user_login` as a fallback. (I thought WordPress defaulted the display name property to the username but it doesn’t.)
AC ✅
This change will make the ID parsing of the report for these dimensions unnecessary (no longer used) so we might want to open a follow-up to remove this but we could also include it in the IB for this one as it’s probably quite straightforward. Later if we decide to do this again for whatever reason we can always pull code back from the history rather than keep it around as unused.
@sigal-teller We discussed this issue on our last AC Sync and felt your input could help on how to display these, e.g. show “author ID x” with a tooltip that says the author ID does not have a display name.
I would probably hold on this issue for now until @marrrmarrr comes back as the News team had some feedback on sending the string display names directly instead of IDs to GA so that users can see the categories and author names directly in Analytics as well and don’t have to rely on Site Kit to display this data “meaningfully”.