Tracking issue: Self-reporting to collect real-world statistics #2909

Closed
12 of 27 tasks
Hocuri opened this issue Dec 28, 2023 · 4 comments
@Hocuri
Collaborator

Hocuri commented Dec 28, 2023

I'll update the issue description here whenever there is some new discussion on the topic. Below we can have an "unordered" discussion.

Goal: Collect real-world statistics on usage of verified chats and other parts of Delta Chat by allowing the user to send us some anonymized info. Not only will this be opt-in, but the user will also be able to see what will be sent and control when it is sent.

Timeline: Feb 2024

  • Implement an opt-in self-reporting mechanism for verified chat usage on Android (UI, Core)
  • Implement a data-collection bot that stores anonymized user-reports for inspection and analysis by developers. (Deploying PR)
    • Include the bot's public key as "verified" in DC Core, so that this works with chatmail
      • Maybe we should fix the private key on the bot side and import it on deploy, so that even a fresh setup works and keeps the key
  • Refine which metrics actually are interesting to us
  • Possibly change the name from self_reporting_bot to statsbot
  • Possibly send the statistics in an attachment instead of message text (see below)
  • Release and deploy privacy-preserving data collection bot
  • Release self-reporting mechanism on Android
  • List of prioritized verified chat issues obtained from field data

Metrics I think will be interesting

Can be extracted from the db today

I checked the boxes of whatever metrics I already implemented.

  • Key creation date
  • #messages
  • #chats
  • #chats using verified, opportunistic, or no encryption
  • #chats with dc/non-dc users
  • Android system version
  • Core version
  • Android UI version
  • DB size
  • Anything else that's shown at the beginning of the log already
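
As a rough illustration of what "extracted from the db" could look like, here is a minimal Python sketch that counts messages and chats with plain SQL. The table names `msgs` and `chats` and the `encryption` column are hypothetical stand-ins for this example, not the real Delta Chat schema.

```python
import sqlite3


def collect_db_metrics(conn: sqlite3.Connection) -> dict[str, int]:
    """Return simple counts from a (hypothetical) chat database."""
    cur = conn.cursor()
    metrics = {
        "messages": cur.execute("SELECT COUNT(*) FROM msgs").fetchone()[0],
        "chats": cur.execute("SELECT COUNT(*) FROM chats").fetchone()[0],
    }
    # One count per encryption mode; column name and values are assumptions.
    for mode, count in cur.execute(
        "SELECT encryption, COUNT(*) FROM chats GROUP BY encryption"
    ):
        metrics[f"chats_{mode}"] = count
    return metrics


# Tiny in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE chats (id INTEGER PRIMARY KEY, encryption TEXT NOT NULL);
    CREATE TABLE msgs (id INTEGER PRIMARY KEY, chat_id INTEGER NOT NULL);
    INSERT INTO chats (encryption) VALUES ('verified'), ('opportunistic'), ('verified');
    INSERT INTO msgs (chat_id) VALUES (1), (1), (2), (3);
    """
)
db_metrics = collect_db_metrics(conn)
```

Only aggregate counts leave the function; no message content or addresses are read.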

Additional metrics we could collect during usage

Note that before implementing these, we'll have to check whether having this information on the phone could be dangerous, e.g. maybe having created a lot of groups could make you look suspicious, since you're probably some sort of leader? Probably I'm overthinking it, though.

  • #failed QR Code Scans / securejoin (-> is this a real-world problem?)
  • #not decryptable messages (-> is this a real-world problem?)
  • #failed decryptions (-> is this a real-world problem?)
  • #created groups (-> how many people create any groups at all)
  • A unique id (-> To recognize the user again without needing to store the email address)
  • how often did (verified) keys of chat partners change
  • A history of all these data (how it evolved over time)
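
For the unique id, one possible approach (an assumption, not something decided in this issue) is a random token that is generated once and stored locally, so it is not derived from the email address or key. A minimal Python sketch, where the `self_reporting_id` config key is hypothetical:

```python
import secrets


def get_or_create_report_id(config: dict[str, str]) -> str:
    """Return a stable random id, creating it on first use."""
    # `self_reporting_id` is a hypothetical config key for this sketch.
    if "self_reporting_id" not in config:
        # 128 random bits, unrelated to the user's address or key.
        config["self_reporting_id"] = secrets.token_hex(16)
    return config["self_reporting_id"]


config: dict[str, str] = {}
first = get_or_create_report_id(config)
second = get_or_create_report_id(config)
```

Repeated reports from the same installation then carry the same id, which would also allow building the history mentioned above on the receiving side.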

UX details

  • In the Advanced settings, there will be a new button "Send statistics to Delta Chat's developers".
    • To be implemented later: Also, after a user sent&received lots of messages, send a device message that asks the user to do the same
  • If the user clicks it, a draft message will be created in a chat with the bot. The user has to manually click "Send" to actually send it.
    • Open question: Should the statistics be in an attachment or in the message text? Pro attachment: The message text can be some human-readable explanation of what this message is about; it doesn't clutter the chat; email providers sometimes mess up the message text; doesn't confuse users who don't understand the metrics. Pro message text: Easier to see what's actually sent (1 step less); the user may not be able to open the attachment (idk if it's possible to open a draft attachment on all platforms by clicking on it)
  • A bot will receive the attachment, and the data will be visualized. At this point I'll need some help since I don't have any experience with this.

Technical details

Use OpenMetrics as the data format and Prometheus to visualize them. To create the file, format strings are probably enough; if not, https://lib.rs/keywords/openmetrics (or https://crates.io/search?q=openmetrics) may be interesting.
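
To show that format strings can be enough, here is a minimal Python sketch that renders a few gauges as OpenMetrics text. The metric names are made up for illustration, and the sketch assumes help texts contain no characters that need escaping (backslash, double quote, newline).

```python
def render_openmetrics(metrics: dict[str, tuple[str, float]]) -> str:
    """Render gauges as OpenMetrics text using plain format strings."""
    lines = []
    for name, (help_text, value) in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"{name} {value}")
    # OpenMetrics requires the exposition to end with an EOF marker.
    lines.append("# EOF")
    return "\n".join(lines) + "\n"


# Hypothetical metric names and values.
report = render_openmetrics(
    {
        "chats": ("Number of chats", 3),
        "messages": ("Number of messages", 4),
    }
)
```

Everything is modeled as a gauge here to keep the sketch short; counters would additionally need the `_total` sample suffix.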

Alternatives

  • Make it a setting and automatically send the statistics if enabled.
@hpk42
Contributor

hpk42 commented Jan 23, 2024

The use of Prometheus is not mandatory; it is just a tool we currently use, fwiw. Prometheus has the issue that it mandates a pull model: it wants to pull metrics from (cloud) servers. For this issue and also for chatmail measurements we rather need a push model, and using email for that is natural. Data can then be made available to Prometheus, or some of us can try to render something directly out of the data. I don't think there needs to be a very customizable UI for the visualization, but if it comes "for free" then why not.
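
The push model over email could be sketched roughly as follows, using only Python's standard `email` package rather than any Delta Chat API. The addresses, subject, and file name are hypothetical, and the parser assumes simple `name value` sample lines without labels.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_report_mail(sender: str, bot: str, report: str) -> bytes:
    """Client side: wrap the metrics text in a mail with one attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = bot
    msg["Subject"] = "Statistics"
    msg.set_content("This message contains anonymized usage statistics.")
    msg.add_attachment(report, subtype="plain", filename="statistics.txt")
    return msg.as_bytes()


def extract_samples(raw: bytes) -> dict[str, float]:
    """Bot side: read `name value` lines from every attachment."""
    msg = message_from_bytes(raw, policy=policy.default)
    samples: dict[str, float] = {}
    for part in msg.iter_attachments():
        for line in part.get_content().splitlines():
            # Skip blank lines and metadata such as "# TYPE" or "# EOF".
            if not line or line.startswith("#"):
                continue
            name, value = line.split()[:2]
            samples[name] = float(value)
    return samples


raw = build_report_mail(
    "user@example.org",
    "statsbot@example.org",
    "# TYPE chats gauge\nchats 3\n# EOF\n",
)
samples = extract_samples(raw)
```

The bot could store the parsed samples and expose them to Prometheus or to a custom renderer later; that part is left out here.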

@link2xt
Contributor

link2xt commented Jan 25, 2024

May be interesting to collect:

  • number of different domains for verified contacts
  • number of directly verified contacts
  • number of non-directly verified contacts.

@Hocuri
Collaborator Author

Hocuri commented Jan 27, 2024

@link2xt Why are they interesting?

At some point we'll need a discussion about which metrics are worth having, so I'll start (ping @adbenitez who proposed most of the metrics below).

The reason to collect data is that we want to know which usability problems to prioritize, and to help us with some future decisions. I put a "?" next to all the metrics where I can't think of any decision for which this metric might be useful, but maybe there is one?

number of messages
?
number of chats
?
number of different domains for verified contacts
?
number of directly verified contacts
?
number of non-directly verified contacts.
?
What "email displaying mode" users use (chats-only, "all" etc.)
Contra: I don't think we'll change this anytime soon, regardless of what the metrics look like.

Number of accounts configured
Pro: This will show us whether to prioritize multi-account.
Contra: multi-account won't go away, and I don't see us prioritizing it a lot more or less than right now.

Max number of members in a group
?
Inbox quota size
?
Number of messages per month or day
?

Feel free to directly edit this comment with pro and con arguments. Also, I didn't include most of the metrics that are already in my first comment, because most of them seem helpful to me, but feel free to also open a discussion about whether they are helpful.

Hocuri added a commit to chatmail/core that referenced this issue Feb 7, 2024
Part of deltachat/deltachat-android#2909

For now, this is only sending a few basic metrics.
Hocuri added a commit that referenced this issue Feb 7, 2024
@r10s
Member

r10s commented Jun 12, 2024

closing as there is no concrete actionable item left for android in this issue tracker.

if needed, reopen a new issue or reopen eg. at https://github.com/deltachat/interface/issues where some cross-platform things are discussed/tracked

@r10s r10s closed this as completed Jun 12, 2024
4 participants