Skip to content
This repository was archived by the owner on Feb 18, 2024. It is now read-only.

[DO NOT MERGE] Propose refined sample format #311

Closed
wants to merge 1 commit into from

Conversation

jmdobry
Copy link
Contributor

@jmdobry jmdobry commented Mar 5, 2019

Some of the notes for the reasoning behind this proposal are in googleapis/nodejs-dlp#154.

More detailed notes can be provided upon request.

This PR demonstrates a new sample format for LRO-based samples.

This PR is not intended to be merged, but to serve as a template so this samples in this repo can be replaced with auto-generated ones.

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Mar 5, 2019
@jmdobry jmdobry added status: blocked Resolving the issue is dependent on other work. do not merge Indicates a pull request not ready for merge, due to either quality or timing. labels Mar 5, 2019
const [response] = await operation.promise();

const transcription = response.results
.map(result => result.alternatives[0].transcript)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a convention to be followed by samples? It is a feature which generated samples need to support?

Specifically: translating a response collection into a newline delimited string by joining string fields of its elements (note: element may be deeply nested)

@jmdobry

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend:

for (const result of response.results) {
  // First alternative is the most probable result
  const alternative = result.alternatives[0];
  console.log(`Transcript: ${alternative.transcript}`);
}

^--- this is how I have the sample configured today

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which doesn't dismiss the question, however.

Is there any reason to support this type of map.join operation in samples?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually just refactored it to match the samples with print loops from googleapis/nodejs-dlp#154

It results in basically the same output, and makes the generator's job easier. PTAL

// Creates a client.
const client = new speech.SpeechClient();

async function transcribeAsync() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this function take no parameters?

As a JavaScript developer, would it not be common for me to read online documentation of functions with standard @param name {type} description

e.g. example from generated sample

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe the snippet on cloud site will include the function, right? My assumption was that we're very much optimizing these for consumption from cloud site, not the repo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The snippet on cloud site does include the function.

  1. We must to show the function for Node.
  • Otherwise we use await but the snippet does not show the outer async function context (we've had bug reports from users running our snippets by copy/pasting and then Node.js blows up because we're using await but they didn't copy/paste a wrapping async function.
  1. All languages** have standardized on showing method definitions in the embedded snippets. Especially important for languages where an import must happen above/outside of a method, so that we can show the import and the sample. Also has been very useful because it gives us the ability to use standard @param descriptions of each input parameter

** all but 1 or 2 which have not made clear decisions one way or the other afaict

#misc I have standardized on using (a) methods/functions (b) standard parameter documentation across all languages for the first version of the generator. It's very consistent. It looks good. Every developer is accustomed to reading the format. It includes the type of each param. In the near future we can consider changing this standard if absolutely necessary, but it works really, really, really well.

Copy link
Contributor Author

@jmdobry jmdobry Mar 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this sample, the user on cloud.google.com will see exactly:

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const filepath = 'path/to/audio.raw';
// const encoding = 'LINEAR16';
// const sampleRateHertz = 16000;
// const languageCode = 'en-US';

// Imports the Google Cloud Speech API client library.
const {SpeechClient} = require('@google-cloud/speech');

// Import other required libraries.
const fs = require('fs');
const util = require('util');
const readFile = util.promisify(fs.readFile);

// Creates a client.
const speech = new SpeechClient();

async function transcribeAsync() {
  // Get the bytes of the file.
  const fileBytes = Buffer.from(await readFile(filepath)).toString('base64');

  const request = {
    audio: {
      content: fileBytes,
    },
    config: {
      // Encoding of the audio file, e.g. 'LINEAR16'.
      encoding: encoding,
      // Sample rate in Hertz of the audio data sent.
      sampleRateHertz: sampleRateHertz,
      // BCP-47 language code, e.g. 'en-US'.
      languageCode: languageCode,
    },
  };

  // Detects speech in the audio file using a long-running operation. You can
  // wait for now, or get its results later.
  const [operation] = await speech.longRunningRecognize(request);

  // Wait for operation to complete.
  const [response] = await operation.promise();
  const transcriptions = response.results;
  console.log(`Transcriptions: ${transcriptions.length}`);
  for (const transcription of transcriptions) {
    console.log(transcription.alternatives[0].transcript);
  }
}

transcribeAsync();
  1. The input parameters are commented out at the top, so they're the very first thing the user sees.
  2. The visible function is only there to enable async/await
  3. The visible function is invoked at the very bottom of the sample
  4. There's no point passing the commented out variables to the function because they're already in scope and visible from within the function. Adding them to the function signature and the function invocation would clutter the sample.
  5. The sample is still runnable from the command-line by someone who cloned the repo because outside the region tags there is an additional wrapping function that does take all the input parameters received from the command-line.

) {
// [START speech_transcribe_async]
/**
* TODO(developer): Uncomment these variables before running the sample.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Uncomment these variables before running the sample. the language we always want to use?

What about the cases where samples run just fine, as-is. No modifications necessary?

We have set this as a goal, when possible, and have been quite successful. e.g. there is a GCS version of this sample where, rather than a local audio file, a path to an audio file in GCS is specified. We point to a public audio file in our common bucket for sample files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this approach, and would love to us trying to do it as much as humanly possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the guidance from our working group was to minimize how many variables the user must set (with the exception of project ID mentioned in other comments, but still up for discussion).

// const languageCode = 'en-US';

// Imports the Google Cloud Speech API client library.
const speech = require('@google-cloud/speech');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DLP canonical samples use an uppercase variable for the package and lowercase for the client.

Following these conventions, this sample would become:

const Speech = require('@google-cloud/speech');

// ...

const speech = Speech.SpeechClient();

Inconsistency to be resolved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My opinion: I strongly prefer the approach this sample takes with speech and client. I prefer this > DLP

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me throw a wrench into the works and suggest this:

const {SpeechClient} = require('@google-cloud/speech');
const speech = new SpeechClient();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer –

const client = new SpeechClient();
  1. Consistency across the board
  2. One less local variable (speech) in the scope of the sample to worry about having a naming collision with 😛 #selfish
  // const language = 'en-US';
  // const inputText = 'Your text to analyze';

  // Imports the Google Cloud Language API client library.
  const {LanguageClient} = require('@google-cloud/language');

  // Import other required libraries.
  const fs = require('fs');
  const util = require('util');
  const readFile = util.promisify(fs.readFile);

  // Creates a client.
  const language = new LanguageClient.SpeechClient();

  /* certain doom */

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! So it looks like we agree on the import syntax. I am however a little confused. Are you advocating for client or speech as the name of the client object?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we simply to the camelCase version of the Constructor, we get the best of both. No collisions, and the variable has the word client in it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed such a change, PTAL

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

until you have a sample that needs two different clients from the same library

Ahhh... yes...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we simply to the camelCase version of the Constructor, we get the best of both. No collisions, and the variable has the word client in it.

I don't follow, what does that look like? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const speechClient = new SpeechClient();

it's pushed to this PR now


test(REGION_TAG, async t => {
const output = await runAsync(
`node ${REGION_TAG}.js`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤘

@JustinBeckwith
Copy link
Contributor

@jmdobry @beccasaurus are we ready to land this yet?

@jmdobry
Copy link
Contributor Author

jmdobry commented Apr 2, 2019

I think ultimately I will close this PR, document the sample guidelines, and then eventually the auto-gen samples will be able to replace everything in this repo in one go. As it is, my PR does not update every samples, I don't want to submit it and leave a set of inconsistent samples behind.

@jmdobry
Copy link
Contributor Author

jmdobry commented May 15, 2019

Canonical samples have been documented elsewhere. Thanks for review!

@jmdobry jmdobry closed this May 15, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
cla: yes This human has signed the Contributor License Agreement. do not merge Indicates a pull request not ready for merge, due to either quality or timing. status: blocked Resolving the issue is dependent on other work.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants