I have no idea either. I know I have the move of the canaries from one DO org to the other on my list, but AFAIK that should not affect the functionality in any way.
It’s doing SOMETHING - since we got this last night
Which I - to be honest - don’t understand. It says there is a problem and then it recovered within 10 minutes. Now I know you’re fast Robert but I doubt that you fixed an issue within 10 minutes in the middle of the night?
Well it feels like an incomplete migration to me, and if @angus needs help to complete it we can jump in. Would really help to have a pointer or two on what the issue might be and where roughly.
PluginGuard::Status.update failed. Errors: Failed to post status to plugin manager: {"errors":["You are not permitted to view the requested resource. The API username or key is invalid."],"error_type":"invalid_access"}
Where is this message coming from? Check out lib/plugin_guard/status.rb. You’ll see where it posts status updates to the server and how it handles errors:
unless response.status == 200
  add_error("Failed to post status to plugin manager: #{response.body.to_s}")
  return false
end
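To make the link between that snippet and the error in the topic concrete, here is a minimal, self-contained sketch. `FakeResponse` and `StatusPosterSketch` are hypothetical stand-ins I made up for illustration (not the plugin's real classes), and the 403 status code is an assumption; only the `unless response.status == 200` branch mirrors the code above.

```ruby
# Hypothetical stand-in for an HTTP response object (assumption, not the real client).
FakeResponse = Struct.new(:status, :body)

# Hypothetical wrapper around the error handling shown above.
class StatusPosterSketch
  attr_reader :errors

  def initialize
    @errors = []
  end

  def add_error(message)
    @errors << message
  end

  # Returns true on a 200 response; otherwise records the response body and returns false.
  def handle(response)
    unless response.status == 200
      add_error("Failed to post status to plugin manager: #{response.body.to_s}")
      return false
    end
    true
  end
end

poster = StatusPosterSketch.new
# 403 is an assumed status code for the invalid_access response.
rejected = FakeResponse.new(
  403,
  '{"errors":["You are not permitted to view the requested resource. The API username or key is invalid."],"error_type":"invalid_access"}'
)
poster.handle(rejected)
puts poster.errors.first
```

So whatever body the server sends back with a non-200 response ends up verbatim in the error message, which is why the report contains the raw invalid_access JSON.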
Then check out the plugin manager server's plugin status controller. You’ll see it requires authorized API access (and has a descriptive error message):
unless is_api? || (is_user_api? && current_user.present?)
  raise Discourse::InvalidAccess.new('plugin statuses can only be updated via authorized api requests')
end
Going back to the plugin guard code you’ll notice there’s one hidden site setting: plugin_manager_api_key for the purpose of running the guard on a server we control (i.e. a canary; the user api key would be used if an end user used the guard). There’s also a dedicated api key scope. Go to the old PMS server and you’ll notice there are two keys bearing the names of the two canaries.
Great, thanks for explaining Angus. I just noticed that tests-passed is now indeed working.
I will make the effort to get stable working later today or tomorrow.
Would you be able to shed some light on this as well?
And one more related request (sorry to keep bothering you): would we have a way to reroute these messages into another category? They are really cluttering the support categories.
The key piece of code to read to understand what is going on here is the status handler.
This specific phenomenon is happening because this job runs every 10 minutes.
And currently the Custom Wizard CI is failing. It gets “resolved” because the tests-passed canary (which I’ve been rebuilding) will post an “OK” status. This logic in the “status” helpers will treat that OK status as a resolution:
def self.working?(status)
  compatible?(status)
end

def self.not_working?(status)
  incompatible?(status) || tests_failing?(status)
end
This should probably read:
def self.working?(status)
  compatible?(status) && !tests_failing?(status)
end

def self.not_working?(status)
  incompatible?(status) || tests_failing?(status)
end
I’ll take a look at this underlying issue.
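For anyone who wants to see the difference in isolation, here is a small runnable sketch. The `StatusSketch` module and the shape of the status hash (`:canary` and `:tests` keys) are assumptions made up for illustration; only the `working?` / `not_working?` logic is taken from the snippets above.

```ruby
# Hypothetical stand-in for the status helpers; the hash shape is an assumption.
module StatusSketch
  def self.compatible?(status)
    status[:canary] == :compatible
  end

  def self.incompatible?(status)
    status[:canary] == :incompatible
  end

  def self.tests_failing?(status)
    status[:tests] == :failing
  end

  # Current logic: a compatible canary ping alone counts as working.
  def self.working_before?(status)
    compatible?(status)
  end

  # Proposed logic: failing tests prevent a status from counting as working.
  def self.working_after?(status)
    compatible?(status) && !tests_failing?(status)
  end

  def self.not_working?(status)
    incompatible?(status) || tests_failing?(status)
  end
end

# A canary reports OK while CI is still failing.
status = { canary: :compatible, tests: :failing }
puts StatusSketch.working_before?(status) # true
puts StatusSketch.working_after?(status)  # false
puts StatusSketch.not_working?(status)    # true
```

With the current logic the same status is both "working" and "not working", which would explain the open-then-resolved cycle. With the proposed change the two predicates no longer overlap for this case.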
This is also why these issue reports have no details: the error details are not being picked up correctly from the CI error.
I’ll deploy it tonight because the PMS still requires a full site rebuild. That’s actually not really necessary anymore, and I’ll remove that special build setup from the PMS server soon (still necessary with the PMS guard).
There are still some tweaks to be made in the details of the error report, and maybe it needs a tag to show it’s a “tests failing” issue. But now when you see one you know what it means.
For the vast majority of the users, it would make sense to have the automated tag muted so they don’t see those topics at all. I’ve done this for myself as they are unhelpful for me - and my experience has improved markedly!
I wonder if it would be best to have those muted by default, but ‘normalled’ for @plugin_admins.
@richard FYI, as I’m guessing this will be the next question
The events plugin status was tests failing because CI was failing. This is why the automated issue topic was created.
CI is now passing, so the “test status” for the plugin is no longer failing. However, since that switch the PMS has yet to receive a “Compatible” status from the tests-passed canary.
A “Compatible” status on the PMS requires BOTH a compatible ping from the relevant canary and tests passing. The compatible status update currently must come while tests are passing.
So the status will be “Unknown” until the tests-passed canary sends a compatible status.
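The sequence described above can be sketched as a tiny state machine. This is only an illustration under assumptions: `PluginStatusSketch`, its method names, and the status symbols are hypothetical, not the PMS code. It only models the two rules stated above (a compatible ping counts only while tests are passing, and the status sits at “Unknown” after CI recovers until the next ping).

```ruby
# Hypothetical model of the behaviour described above; not the actual PMS implementation.
class PluginStatusSketch
  attr_reader :status

  def initialize
    @tests_passing = true
    @status = :unknown
  end

  # Record a CI result. Failing CI marks the plugin as tests_failing;
  # recovering CI only moves it back to unknown, not to compatible.
  def record_ci(passing)
    @tests_passing = passing
    if passing
      @status = :unknown if @status == :tests_failing
    else
      @status = :tests_failing
    end
  end

  # A compatible ping from the canary only counts while tests are passing.
  def record_compatible_ping
    return unless @tests_passing
    @status = :compatible
  end
end

plugin = PluginStatusSketch.new
plugin.record_ci(false)        # CI fails
plugin.record_compatible_ping  # canary ping arrives, but is ignored
puts plugin.status             # tests_failing
plugin.record_ci(true)         # CI recovers
puts plugin.status             # unknown
plugin.record_compatible_ping  # next canary ping
puts plugin.status             # compatible
```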