Tailing the CosmosDB Change Feed

This will hopefully be a short post about how to listen to changes from your CosmosDB SQL API using the Node SDK. For some reason, it reminds me of the oplog tailing mechanism in MongoDB that Meteor used in the past to achieve real-time updates, hence this article's title.

I do not have much time right now, but let me talk a bit about my use case for this. I am currently working on an IoT project that uses several Azure technologies as its infrastructure. As you might have guessed, it pipes data from the IoT sensors into Azure Event Hubs. The data then gets ingested by Azure Databricks, ultimately ending up in an intermediate data warehouse (for dashboards and further analysis) and a NoSQL application data store (more on this next).

This NoSQL application data store is the CosmosDB instance that we are going to tail. The objective was to push changes from this data store to a mobile application in real time, or at least near real time (as opposed to long polling or querying at set intervals). To make a long story short, I ended up tailing CosmosDB's Change Feed in a GraphQL service application to make it easier for the client application to implement a PubSub system. More about this in an older post that I published earlier this year.

Before we dig into the code, let me just say that we are not going to go through how to initialize your CosmosDB collection or how to write a facade / repository class in NodeJS. We will go straight to the Change Feed part but first, make sure you have the correct Node SDK in your project:

npm install --save @azure/cosmos
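
If you need a starting point for that boilerplate, a minimal sketch might look like the one below. The endpoint, key, and database/container names are placeholders, and the exact client options can vary slightly between SDK versions, so adapt it to your setup.

const { CosmosClient } = require("@azure/cosmos");

// placeholders: use your own endpoint, key, database, and container
const client = new CosmosClient({
  endpoint: process.env.COSMOS_ENDPOINT,
  key: process.env.COSMOS_KEY
});

const container = client.database("my-database").container("my-container");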

Once you have set up your CosmosDB boilerplate code in Node, you can access the change feed iterator using the code below:

const changeFeed = container.items.readChangeFeed(partitionKey, options);

If you need to brush up on iterators in JavaScript, you can check this post that I wrote last year.

And that's it! You now have a change feed iterator that you can use to listen to changes from CosmosDB, but let us see what options we have in doing so. The Node SDK currently gives us four ways of listening to the change feed:

1. Start from now

const options = {};

2. Start from a continuation

const { headers } = await container.items.create({ ... });
const lsn = headers["lsn"];
const options = { continuation: lsn };
// I have not used this specific method yet, this example is from here.

3. Start from a specific point in time

const options = { startTime: new Date() }; // any Date object will do

4. Start from the beginning

const options = { startFromBeginning: true };

As all of these are self-explanatory, we will go ahead and use our changeFeed iterator. To retrieve the next value, we can use await changeFeed.executeNext() or we can loop through the next values like this:

while (changeFeed.hasMoreResults) {
  const { result } = await changeFeed.executeNext();
  // do what you want with the result / changes
}

Reading the source code of the Node SDK revealed that it also exposes a real generator (the function signature being public async *getAsyncIterator(): AsyncIterable<ChangeFeedResponse<Array<T & Resource>>>). This would have allowed a more elegant for await...of construct, but unfortunately I bumped into a few issues regarding Symbols when I tried it. If you have used it in the past, please feel free to share in the comments!
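
For reference, here is a rough, untested sketch of how consuming that async iterator would look if it works in your setup (remember that for await...of has to run inside an async function):

// untested sketch: consume the change feed as an async iterable
async function listenToChanges() {
  for await (const response of changeFeed.getAsyncIterator()) {
    // each response should expose the changed documents on its result property
    console.log(response.result);
  }
}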

That will be all for now, and I hope you learned something in this post.

Real-time GraphQL Subscriptions Part 1: Server

In this post, we will not go through GraphQL from the ground up. My assumption is that you are here to learn how to implement real-time updates through the use of the graphql-subscriptions package and that you already have a basic understanding of GraphQL types, queries, and mutations.

We are going to use Apollo for this tutorial. Apollo was made by the same team behind Meteor and is a bit opinionated, but it is also arguably one of the most popular full-featured GraphQL libraries around. We will also use React and create-react-app to bootstrap our client application in the second part of this tutorial. That being said, some knowledge of higher order components is also assumed (in Part 2).

Server Boilerplate

Let's start outlining our backend. Initialize a Node project by issuing npm init in your preferred folder, and then install the dependencies like so:

npm i --save express body-parser cors graphql graphql-tools apollo-server-express

Next, create the three files that we will use for this short tutorial:

touch index.js resolvers.js schema.js

In schema.js, we will then define a type, a root query, and a mutation that we will use later for our subscription:

const { makeExecutableSchema } = require('graphql-tools');
const resolvers = require('./resolvers');

const typeDefs = `
  type Message {
    message: String
  }

  type Query {
    getMessages: [Message]
  }

  type Mutation {
    addMessage(message: String!): [Message]
  }

  schema {
    query: Query
    mutation: Mutation
  }
`;

module.exports = makeExecutableSchema({ typeDefs, resolvers });

Okay, so at this point, we have the schema for a GraphQL server that allows you to send a mutation to add a Message, and a query that allows you to fetch all messages on the server. Let's implement resolvers.js so we can start using our schema:

const messages = [];

const resolvers = {
  Query: {
    getMessages(parentValue, params) {
      return messages;
    }
  },
  Mutation: {
    addMessage(parentValue, { message }) {
      messages.push({ message });
      return messages;
    }
  }
};

module.exports = resolvers;

Oh shoot. We have defined a schema and the functions that will resolve their return values, but we have not set our server up. At least not yet. We are going to use express and apollo-server-express to serve our GraphQL implementation through HTTP:

const express = require('express');
const bodyParser = require('body-parser');
const cors = require('cors');
const { graphqlExpress, graphiqlExpress } = require('apollo-server-express');
const { createServer } = require('http');

const schema = require('./schema');

const app = express();

app.use(bodyParser.json());
app.use(bodyParser.urlencoded({ extended: true }));

app.use(cors());
app.use(
  '/graphql',
  graphqlExpress({
    schema
  })
);

app.use(
  '/graphiql',
  graphiqlExpress({
    endpointURL: '/graphql'
  })
);

const PORT = process.env.PORT || 3030;

const server = createServer(app);
server.listen(PORT, () => {
  console.log(`Server now running at port ${PORT}`);
});

We can now have a working GraphQL server running on http://localhost:3030/graphql by issuing node index.js. Since we have configured the interactive Graphiql as well, we can explore our schema, and issue some sample queries and mutations on http://localhost:3030/graphiql:
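
For example, you can paste the following named operations into Graphiql, run the mutation a few times, and then run the query to read the messages back:

mutation AddMessage {
  addMessage(message: "Hello there!") {
    message
  }
}

query GetMessages {
  getMessages {
    message
  }
}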

Adding real-time through Subscriptions and PubSub

Server Configuration

Our simple GraphQL server is running. That means we can now proceed to the interesting part: implementing real-time updates through Apollo PubSub. As with most modern real-time frameworks, the implementation is done on top of WebSockets. We need to install additional dependencies to make use of this transport layer:

npm i --save graphql-subscriptions subscriptions-transport-ws

We then need to make use of these libraries to enable WebSockets support on index.js:

const { execute, subscribe } = require('graphql');
const { SubscriptionServer } = require('subscriptions-transport-ws');

. . .

const server = createServer(app);
server.listen(PORT, () => {
    console.log(`Server now running at port ${PORT}`);
    new SubscriptionServer(
        {
            execute,
            subscribe,
            schema
        },
        {
            server,
            path: '/subscriptions'
        }
    );
});

Let's modify our /graphiql endpoint as well to make use of our new transport layer, so we can demonstrate that this is working through Graphiql once we are done:

app.use(
    '/graphiql',
    graphiqlExpress({
        endpointURL: '/graphql',
        subscriptionsEndpoint: 'ws://localhost:3030/subscriptions'
    })
);

That's it for the server setup! Let's proceed on fleshing out the subscription implementation.

Defining Subscriptions

In GraphQL, a subscription is just a type, pretty much like Query and Mutation. Go ahead and define our subscription in schema.js:

const typeDefs = `
  
  . . .
  
  type Subscription {
    newMessageAdded: Message
  }

  schema {
    query: Query
    mutation: Mutation
    subscription: Subscription
  }
`;

We have just defined our first subscription. It will allow applications or clients to subscribe and receive updates whenever new messages are added (through a mutation). Just to make sure everything is working correctly, visit the Documentation Explorer in Graphiql and you should now be able to see Subscription and newMessageAdded.

If there are no errors and you can see the new definitions there, then we are ready to make this work by, you guessed it, implementing the resolver function for newMessageAdded.

Implementing the Subscription and Publishing Messages

With the transport configuration and the type definitions done, the only thing we need to do now is to implement newMessageAdded and the actual message publication. The flow will be like this:

1. A client will subscribe to `newMessageAdded`
2. Every time our `addMessage` mutation is queried, we will publish a message to `newMessageAdded`, using the new `message` as the payload.

We need to tweak our resolvers.js to import helpers from graphql-subscriptions. We will use them to implement our newMessageAdded subscription query:

const { PubSub, withFilter } = require('graphql-subscriptions');
const pubsub = new PubSub();

. . .

const resolvers = {
  Query: {
    . . .
  },
  Mutation: {
    . . .
  },
  Subscription: {
    newMessageAdded: {
      subscribe: withFilter(
        () => pubsub.asyncIterator('newMessageAdded'),
        (params, variables) => true
      )
    }
  }
};

module.exports = resolvers;

We just implemented our first subscription query! Every time our server publishes a message to newMessageAdded, clients that are subscribed will get the published payload.

As an aside, the helper function withFilter is not strictly required for our example here (just subscribe: () => pubsub.asyncIterator('newMessageAdded') will do for this tutorial), but I figured it will be helpful if you want to try something more useful with this whole pubsub ordeal, like, say, a classic chat app.
The second function that you pass as an argument to withFilter allows you to filter out the subscribers who will receive the message. This is done by using the fields of the actual payload that is about to get published (params) and the GraphQL query variables from the subscription (variables). All you need to do is return a truthy value if you want the payload sent to this particular subscriber. It will look roughly like this: return params.receiverId === variables.userId. Of course, that assumes a query variable called userId was sent along with the subscription.
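
A rough sketch of that chat-style filter, assuming a hypothetical receiverId field is published alongside the payload and a userId variable is sent with each subscription (neither exists in our example app), might look like this:

Subscription: {
  newMessageAdded: {
    subscribe: withFilter(
      () => pubsub.asyncIterator('newMessageAdded'),
      // params is the published payload, variables holds the subscriber's query variables
      (params, variables) => params.receiverId === variables.userId
    )
  }
}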

Since we do not have an application that will subscribe to our server yet, why don't we try this out with Graphiql?
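
In the Graphiql query pane, run a subscription like this one and it should start listening for new data:

subscription {
  newMessageAdded {
    message
  }
}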

If the subscription starts listening without errors, great! Everything is working. But if we do not publish anything anywhere on our server, nothing will happen. Yep, we are about to do just that.

In fact, we just need to add one line to our addMessage resolver:

  Mutation: {
    addMessage(parentValue, { message }) {
      messages.push({ message });
      
      // blame prettier for not making this a one-liner 
      pubsub.publish('newMessageAdded', {
        newMessageAdded: { message }
      });
      
      return messages;
    }
  }

We can now test this using Graphiql on two browser windows. The first browser will be the subscriber, and the second one will send the mutations:
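
In the subscriber window, run the newMessageAdded subscription from earlier. In the other window, send a mutation like this:

mutation {
  addMessage(message: "Hello from the second window!") {
    message
  }
}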

As soon as you send an addMessage mutation from the second browser, the first browser receives the message and displays it instantly! How cool is that? Let's wrap up what we learned in this short tutorial.

Wrap up

In this tutorial, we learned how to set up subscriptions and publish messages to subscribers using graphql-subscriptions. In the next part of this tutorial, we will use apollo-client with react-apollo to see how this works with a real application as the subscriber.

The complete source code for this tutorial can be found here.

If you encountered any errors or have any questions about this, let me know in the comments section below!

Installing MongoDB 3.4 (with SSL) on Ubuntu 16.04 (MS Azure)

Hey everyone. I know it has been a while since I wrote something. I have been busy with multiple, large scale projects during the past few months, so I was almost always too tired at the end of the day to compose a new entry. I also had to relocate; I think the adjustment phase also took a lot of my time and energy. Anyway, what I am going to try to do now is to write short, straight to the point tutorials about how to do specific tasks (as opposed to going into more detailed, wordy posts). I will still write the elaborate ones, but I will be focusing on consistency for now. I have been working on a lot of interesting problems and relevant technologies at work, and I just feel guilty that I do not have enough strength left at the end of the day to document them all.

Let us start with this simple topic just to get back into the habit of writing publicly. I have been configuring Linux VMs for a while now, but I have not really written anything about it, aside from my series of Raspberry Pi posts. Also, it is my first time working with the Azure platform, so I thought that it might be interesting to write about this today.

This tutorial will assume that the Ubuntu 16.04 VM is already running and you can SSH properly into the box with a sudoer account.

The Basics: Installing MongoDB

You can read about the official steps here. If you prefer looking at just one post to copy and paste code in sequence, I will still provide the instructions below.

Add the MongoDB public key

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 0C49F3730359A14518585931BC711F9BA15703C6

Add MongoDB to apt's sources list

echo "deb [ arch=amd64,arm64 ] http://repo.mongodb.org/apt/ubuntu xenial/mongodb-org/3.4 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.4.list

Update apt repository and install MongoDB

sudo apt-get update && sudo apt-get install -y mongodb-org

Run and check if MongoDB is running properly

sudo service mongod start
tail -f /var/log/mongodb/mongod.log

If everything went well, you should see something like this:

2017-10-04T01:18:51.854+0000 I NETWORK [thread1] waiting for connections on port 27017

If so, let's continue with the next steps!

Create a root user

I will not get into the details of how to create and manage MongoDB databases and collections here, but let us go through the process of creating a root user so we can manage our database installation remotely with this user.

Connect to MongoDB CLI

mongo

Use the admin database

use admin

Create admin user

db.createUser(
    {
      user: "superadmin",
      pwd: "password123",
      roles: [ "root" ]
    }
)

SSL and some network-related configuration

Now that we have MongoDB installed and running, we need to make some changes to the mongod.conf file to enable SSL and to make our MongoDB installation accessible on our VM's public IP and chosen port.

SSL Certificates

Creating a self-signed certificate

If you already have a certificate or you just bought one for your database for production use, feel free to skip this step. I am just adding this for people who are still experimenting and want SSL enabled from the start. More information regarding this can be found here.

This self-signed certificate will be valid for one year.

sudo openssl req -newkey rsa:2048 -new -x509 -days 365 -nodes -out mongodb-cert.crt -keyout mongodb-cert.key

Create .pem certificate

This .pem certificate is the one that we will use on our mongod.conf configuration file. This command will save it on your home directory (/home/<username>/mongodb.pem or ~/mongodb.pem).

cat mongodb-cert.key mongodb-cert.crt > ~/mongodb.pem

MongoDB Configuration

Now that we have our self-signed certificate and admin user ready, we can go ahead and tweak our MongoDB configuration file to bind our IP, change the port our database will use (if you want to), enable SSL, and enable authorization.

I use vim whenever I am dealing with config files via SSH; you can use your favorite text editor for this one.

sudo vim /etc/mongod.conf

Make sure to change the following lines to look like this:

net:
  port: 27017
  bindIp: 0.0.0.0
  ssl:
    mode: requireSSL
    PEMKeyFile: /home/<username>/mongodb.pem

security:
  authorization: enabled

Restart the MongoDB service:

sudo service mongod restart

If we go ahead and print the MongoDB logs like we did earlier, we should be able to see something that looks like this (notice that there is an "ssl" at the end now):

2017-10-04T01:18:51.854+0000 I NETWORK [thread1] waiting for connections on port 27017 ssl

If you got that, it means that everything is working fine. We just need one more command to make sure that our MongoDB service starts automatically across VM reboots. systemctl will take care of that for us:

sudo systemctl enable mongod.service

Azure Firewall

Now, if you try to connect to your database using your favorite MongoDB database viewer or by using the Mongo CLI on your local machine, you might notice that you will not be able to connect. That's because we need to add an Inbound security rule on the Azure portal first.

Once on the Dashboard, click on All Resources.
Azure Portal Dashboard

Click on the Network Security Group associated with your VM.

Azure Portal Inbound Security Rules

From here, you can see a summary of all the security rules you have for your virtual network. Click on Inbound security rules under Settings on the left pane.

Azure Portal Network Security Group Settings

Click Add. You should be able to see a form with a lot of fields. We are using MongoDB's default port, so we can just click on Basic at the top so we can select from a list of preset ports.

Basic Inbound security rules form

Just click on OK, and we are done! You can start connecting to your MongoDB installation using your tool of choice.
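
To verify the whole setup from your local machine, you can try connecting with the mongo shell. The --sslAllowInvalidCertificates flag is only needed here because we are using a self-signed certificate; substitute your own host and credentials:

mongo --host <your-vm-public-ip> --port 27017 --ssl --sslAllowInvalidCertificates -u superadmin -p password123 --authenticationDatabase admin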

Implementing Token-Based Authentication With jwt-simple

In this post, we will talk about JSON Web Tokens, most commonly known by the acronym JWT. If you have done any web development work for the last few years, you must have heard of it, or even used a package that uses JWT to implement a token-based authentication mechanism under the hood.

We will examine what a JWT is and describe what comprises a valid token. Next, we will implement basic authentication using Node/Express and the jwt-simple package.

What is JWT?

According to the comprehensive Introduction to JSON Web Tokens:

JSON Web Token (JWT) is an open standard (RFC 7519) that defines a compact and self-contained way for securely transmitting information between parties as a JSON object. This information can be verified and trusted because it is digitally signed. JWTs can be signed using a secret (with the HMAC algorithm) or a public/private key pair using RSA.

JWT is said to be compact because it uses JSON, which is pretty much how every web application these days passes data between consumers and other APIs. That means that a JWT can easily be passed around as a query parameter, through a POST request, or through request headers. Being self-contained adds to its portability, because the token can carry all the information it needs in itself. We will see this in practice in our small Express application.

Anatomy of JSON Web Tokens

A JSON Web Token is made up of three parts that are separated by dots. The first two parts are called the Header and the Payload, respectively. Both of them are Base64-encoded JSON objects that contain several pieces of information that we are going to briefly discuss below.

The Header object contains the type of the token and the signing algorithm used. Since we are going to create a basic authentication mechanism on an Express app, the type is JWT and the signing algorithm will be a keyed-hash message authentication code (HMAC). Since we will use a package that simplifies the encoding and decoding of our tokens, there is no need to set this explicitly and we will stick with the default, which is HMAC SHA-256.

The Payload contains what the specification refers to as claims. They are pieces of information that can be attached to the token for identification or verification purposes. Claims are further categorized as Registered Claims, Public Claims, and Private Claims. In our example app, we will use Registered Claims to identify our application as the Issuer of the token and to set its expiry. We will also make use of the user's name and email as Public Claims.

Now that we have discussed the first and the second part of a JWT, it is time for the third one, which is called the Signature. Once we have the Header and the Payload properly encoded as Base64 strings, they need to be concatenated with a dot and then hashed with the app secret. This process produces the token's Signature. The secret can be any string, but as the name suggests, keep it secret: anyone who has it can forge valid tokens. (Note that the Header and Payload are only Base64 encoded, so anyone holding the token can read them.)

Here's an example token:

eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJleHAiOjE0ODk5OTEyNjI3NTIsImlzcyI6IkpvaG4gQ3Jpc29zdG9tbyIsIm5hbWUiOiJjcmlzb3N0b21vIiwiZW1haWwiOiJjcmlzb3N0b21vQGpvaG4uY29tIn0._CP8KU_AX4XNJKyxD561LTiFbY0HcPFKRgI1AztGMsI

Try to notice the dots that separate the three parts of the token. To wrap this section up and as a review: the first two parts are the Base64-encoded JSON objects that contain information about the user and our application. The third part is the hash of the first two parts, with the application secret used as the hash key.
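
To make this concrete, here is a rough sketch (for illustration only, with a made-up secret and payload) of how the three parts could be assembled by hand using Node's built-in crypto module; jwt-simple does essentially this for us:

const crypto = require('crypto');

// Base64url-encode a JSON object (Base64 with URL-safe characters and no padding)
function base64url(obj) {
  return Buffer.from(JSON.stringify(obj))
    .toString('base64')
    .replace(/=/g, '')
    .replace(/\+/g, '-')
    .replace(/\//g, '_');
}

const header = { typ: 'JWT', alg: 'HS256' };
const payload = { iss: 'John Crisostomo', name: 'crisostomo' };

// the first two parts: the encoded Header and Payload joined by a dot
const unsigned = `${base64url(header)}.${base64url(payload)}`;

// the third part: an HMAC SHA-256 of the first two parts, keyed with the app secret
const signature = crypto
  .createHmac('sha256', 'my_app_secret')
  .update(unsigned)
  .digest('base64')
  .replace(/=/g, '')
  .replace(/\+/g, '-')
  .replace(/\//g, '_');

console.log(`${unsigned}.${signature}`);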


Application Demo

It is now time for the application demo. At this point, we already have a good grasp of what a JSON Web Token is and its parts. We are now ready to put this into practice by creating a demo application to solidify the concepts that we have learned. Before we start, a word of precaution:

The example app that we will build in this section is for the sole purpose of understanding how JWT can be used to implement barebones token-based authentication. Please do not use this example in production. There are better packages out there that use jwt-simple under the hood and make this process foolproof.

Dependencies
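
We will use express and body-parser for the HTTP server, jwt-simple to encode and decode our tokens, and moment to help us compute expiry timestamps. Install them like so:

npm i --save express body-parser jwt-simple moment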

Creating the user store and the token store

Since this is a fairly small project, we will not use any real databases. Instead, the users will be hard coded in an array, as well as the tokens. We will create two files to implement these functionalities in this section.

USERS.JS

const users = [
  { _id: 1, name: "john", email: "john@crisostomo.com", password: "john12345" },
  { _id: 2, name: "crisostomo", email: "crisostomo@john.com", password: "crisostomo12345" },
];

function validateUser(username, password) {
  const user = users.find((user) => {
    return user.name === username && user.password === password;
  });

  return user;
}

module.exports = { validateUser };

TOKENS.JS

const tokens = {};

module.exports = {
  add: function(token, payload) {
    tokens[token] = payload;
  },

  isValid: function(token) {
    if (!tokens[token]) {
      return false;
    }

    if (tokens[token].exp <= new Date()) {
      // the token has expired, so remove it from the store
      delete tokens[token];
      return false;
    } else {
      return true;
    }
  }
};

In our users.js file, we exposed a convenience method that lets us easily validate a user by searching through our users array. Our tokens.js file allows us to add a token with its associated payload. It also has a method that checks a token's validity.

Creating our application

This is where we create our application. Our app will have two endpoints: one for accessing a restricted route, and another one where we can obtain tokens for registered users. These endpoints are /secretInfo and /token.

On a high level, we can obtain a valid token if we send a POST request to the /token endpoint with valid user credentials. This token can then be used to access the information at /secretInfo.

The first thing that we need to do is to require the dependencies mentioned above, and set the server to run at port 8080:

const express = require('express');
const bodyParser = require('body-parser');
const jwt = require('jwt-simple');
const moment = require('moment');
const users = require('./users');
const tokens = require('./tokens');

const app = express();
app.use(bodyParser.json());

const jwtAttributes = {
  SECRET: 'this_will_be_used_for_hashing_signature',
  ISSUER: 'John Crisostomo', 
  HEADER: 'x-jc-token', 
  EXPIRY: 120,
};

app.listen(8080);

console.log('JWT Example is now listening on :8080');

This sets up all our dependencies and imports our user and token stores. We also declared an object called jwtAttributes. This object contains the claims that will be used for our token, as well as some other attributes like the app secret and the header key. At this point, the server will run but will not do anything because we have not implemented any routes or endpoints.

Let us start implementing the /token endpoint.

// AUTH MIDDLEWARE FOR /token ENDPOINT
const auth = function (req, res) {
  const { EXPIRY, ISSUER, SECRET } = jwtAttributes;

  if (req.body) {
    const user = users.validateUser(req.body.name, req.body.password);
    if (user) {
      const expires = moment().add(EXPIRY, 'seconds')
        .valueOf();
      
      const payload = {
        exp: expires,
        iss: ISSUER,
        name: user.name,
        email: user.email, 
      };

      const token = jwt.encode(payload, SECRET);

      tokens.add(token, payload);

      res.json({ token });
    } else {
      res.sendStatus(401);
    }
  } else {
    res.sendStatus(401);
  }
};

app.post('/token', auth, (req, res) => {
  res.send('token');
});

Before we set up our route for the /token endpoint, we created the authentication middleware. It checks if the request has a body and tries to find a user with a matching password in our user store. This middleware could use more validation, but I am keeping it simple to make our example less cluttered.

If a user is found, it sets the token's expiry with the help of moment and the amount of time defined in our jwtAttributes object. Next, we proceed to construct our payload. Notice that we have two registered claims, exp and iss, which stand for expiration and issuer, and two public claims, which are the user's name and email.

Finally, the encode method of the jwt-simple package abstracts the process of encoding our payload. It generates our token by encoding the header and payload, concatenating them, and signing the result with the app secret. If the request's body is invalid or if the user/password combo is not found in our store, we return a 401 Unauthorized response. The same goes for blank requests, too.

Time for the /secretInfo endpoint.

// VALIDATE MIDDLEWARE FOR /secretInfo
const validate = function (req, res, next) {
  const { HEADER, SECRET } = jwtAttributes;

  const token = req.headers[HEADER];

  if (!token) {
    res.statusMessage = 'Unauthorized: Token not found';
    res.sendStatus(401);
  } else {
    try {
      // decode() verifies the signature and throws if the token has been tampered with
      jwt.decode(token, SECRET);
    } catch (e) {
      res.statusMessage = 'Unauthorized: Invalid token';
      res.sendStatus(401);
      return;
    }

    if (!tokens.isValid(token)) {
      res.statusMessage = 'Unauthorized: Token is either invalid or expired';
      res.sendStatus(401);
      return;
    }
    next();
  }
};

app.get('/secretInfo', validate, (req, res) => {
  res.send('The secret of life is 42.');
});

Similar to our /token endpoint, we start by implementing our validate middleware. It checks if a token exists in the header, then jwt-simple decodes the token, and finally the token gets validated through our token store's method. If the token is found and is not yet expired, we call the next handler, and the secret message is sent. Otherwise, we send a 401 Unauthorized response.

Now that we have finished implementing both endpoints, we can proceed in testing them with Postman.

Testing our app with Postman

Postman is a nifty Chrome app that can be used to test REST APIs. You can get Postman here.

If we send a GET request directly to /secretInfo, we will get a status code of 401, along with an Unauthorized message:

Likewise, sending an incorrect user credentials will give us the same response:

Providing the /token endpoint with a valid payload (a valid JSON with correct user credentials) will provide us a token that is bound to expire in two minutes:

We can then use the token by sending another GET request to the /secretInfo endpoint, including the token in the x-jc-token header (we specified this key in the jwtAttributes object):
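
If you prefer the command line over Postman, the same requests can be made with curl (assuming the server is running locally on port 8080):

# request a token with valid credentials
curl -X POST http://localhost:8080/token -H "Content-Type: application/json" -d '{"name": "john", "password": "john12345"}'

# use the returned token to access the protected route
curl http://localhost:8080/secretInfo -H "x-jc-token: <token-from-previous-response>"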

Wrap up

That's it! We have successfully implemented basic token-based authentication on Express by using jwt-simple. Equipped with this knowledge, we can now understand how popular authentication packages use JWT under the hood. That makes us more capable of troubleshooting JWT authentication problems, or even contributing to these packages. If you want to clone the files in this mini-tutorial, you can get them on this Github repository. If you are interested in learning more about JWT, you can get a free eBook here.


Basic Generators in JavaScript

I was watching a movie last night when my mind spun off on a different thread and remembered a JavaScript language feature that has existed for some time now, but that I have never had the chance to use. At least, not directly.

We do bleeding edge JavaScript at the office. That means we have all these new language features at our disposal as early as possible through the use of Babel. We write JavaScript code using the newest language specification (ECMAScript 6/7) and our code gets transpiled into ECMAScript 5. We have been using all the nifty features such as import, async/await, spread/rest operators, and destructuring as early as last year. These are just the new ES6 features that I can think of off the top of my head, maybe because they are the most practical ones.

There is one feature, however, that can be really powerful but that I have not really been able to leverage: generators. Prior to V8 v5.5 and Node v7.6.0, Babel's async/await and other asynchronous libraries used generators under the hood to implement this feature.

But what are generators? According to the venerable MDN page:

A generator is a special type of function that works as a factory for iterators. A function becomes a generator if it contains one or more yield expressions and if it uses the function* syntax.

MDN's definition is clear and straightforward, but let me rephrase it from what I have understood. Aside from producing iterables, think of a generator as a function that you can play and pause. This characteristic makes it possible to implement asynchronous programming, and when used with promises, you can come up with all sorts of things, including your own async library if you want to make one for learning purposes.

Let's dig into some basic code examples to solidify our understanding of generators:

function* counter() {
  for (let i = 1; i <= 5; i++) {
    yield i;
  }
}

This function was declared using function* and has a yield inside the function body, so this must be a generator. When we invoke it and assign the result to a variable like so, let c = counter(), we get back a generator object that we can use to iterate over the values of i. This object is an iterator, which in JavaScript means it must have a next() method. This method returns an object that contains a value and a done property. Let's see that in action:

/***************************************************
  Using next() to step through the values explicitly
****************************************************/
let c1 = counter();

console.log(c1.next().value);
// 1
console.log(c1.next().value);
// 2
console.log(c1.next().value);
// 3
console.log(c1.next().value);
// 4
console.log(c1.next().value);
// 5

/***************************************************
  Using a for-of loop
****************************************************/
let c2 = counter();

for (const num of c2) {
  console.log(num);
}

// 1
// 2
// 3
// 4
// 5

/***************************************************
  Using the done property explicitly
****************************************************/
let c3 = counter();

let i = c3.next();

while (!i.done) {
  console.log(i.value);
  i = c3.next();
}

// 1
// 2
// 3
// 4
// 5

We went through three different ways to iterate over the iterator that was returned by our counter generator. In the first example, we manually stepped through the iterator by using next(). We know that next() returns an object with a value and a done property, and so we were able to chain .value every time we logged the iteration to the console. This shows us one of the concepts that we discussed earlier: we were able to play and pause the generator's execution by using the next() method. Another interesting thing is that it remembers its internal state through its iterations.

It works this way: the generator function stops immediately at every yield statement and passes the value on its right to the object returned by next(). We used a loop in our example, and by doing so, the loop gets suspended every time it encounters the yield statement.

Another thing worth knowing is that we can alter the generator's internal state from outside the generator by passing in an argument to next():

function* counter (limit) {
  for (let i = 1; i <= limit; i++) {
    let j = yield i;
    if (j) limit = j;
  }
}

/***************************************************
  Not passing a value to next(): the limit stays at 2
****************************************************/
const c1 = counter(2)

console.log(c1.next().value); // 1
console.log(c1.next().value); // 2
console.log(c1.next().value); // undefined

/***************************************************
  Passing a value to next() to alter internal state
****************************************************/
const c2 = counter(2);

console.log(c2.next().value); // 1
console.log(c2.next().value); // 2
console.log(c2.next(5).value); // 3
console.log(c2.next().value); // 4
console.log(c2.next().value); // 5

The example above is yet another contrived modification to our earlier example. This counter generator accepts an argument as the limit to the number of values it can generate. It has the same loop as the above example, except that the control is now dependent on the limit parameter that was passed to it.

Inside the loop body, we declared a variable j that is assigned the value of the yield expression. This expression is followed by another control structure: an if statement that checks the value of j. The value of j will replace the value of limit if it is truthy.

As I mentioned before showing the examples, we can control the internal state of generators by passing an argument to the next() method. This argument becomes the value of the yield expression inside the generator, and as such we can assign it to a variable and use it to control the generator's behavior.

This can be seen above, where we declared two generators, each with an initial limit of 2 values. With the first one, we did not pass an argument to next(), and so we were only able to iterate through two values. In the second example, we did the same thing, but we passed in a value of 5 as an argument to the third call to next(). This altered the generator's internal limit from two to five values, enabling us to get three more values out of it.



In this post, we learned about the basics of ES6 generators. We went through their basic implementation and usage with some simple examples. We found out that generator functions are declared using the function* keyword and contain at least one yield statement/expression. We also found out that a generator produces an iterator with a next() method. Since this post is getting long, I have decided to split it into two. In my next post, we will explore how to implement basic async/await functionality through the use of generators and promises.