Beyond the Scaffold: Evolving Software Projects with Yeoman Generators

December 29, 2023

Scaffolding is for different projects

Starting anew always brings a lot of excitement. Every new project gives a mix of confidence, from taking first steps you've practised a million times, and anticipation that this time the results will be greater than ever before, because you'll try some new answers to old challenges. At every new beginning, you're stronger than ever before.

It's not surprising that software developers like starting new applications, right?

Nowadays, we create more software packages than ever before. Microservices, microfrontends, microapps, monorepos and tools for them... You need a new module - you isolate it radically, define the communication interface and let it ripen.

Understandably, scaffolding tools are more important than ever before.

They help speed up project creation, ensure it's aligned with best practices and swiftly bring you to the point of creating something of real value - new automations.

Recently, I have been experimenting with extreme modularisation in my Node.js-based project. My objective is to keep it maintainable long term. To that end, I need to make each part of the ecosystem genuinely independent - every part should be ready for growth that doesn't impact other applications. To support creating microapplications in a sustainable manner, I need to ensure one thing - these numerous separate projects will be maintained well.

How quickly a project is created doesn't matter much in the long run if maintainability is compromised.

If my scaffolding tool doesn't help me keep settings consistent between dozens of packages long term, I don't need such a tool. I need a tool that will help me "make up" a project whenever I need it.

Blueprint

Vision

My goal was to find or create a tool that would not only take on the burden of scaffolding a new project quickly and consistently (configuration, packages, package versions and so on) but would also be capable of bringing a project up to advancing standards at any point in the future.

If I use the traditional approach, I run the generator that promotes current project standards in an empty folder, and I get a project that meets the requirements today.

But the standards change over time. Let two months pass after the initial project creation, and a new fast TypeScript compiler reaches production readiness, TypeScript itself gets a new version, the license of choice for your open source projects changes, and a million other things get updated. And you change the standards because you must.

When running the standards ambassador (the scaffolding tool) in the project again, I want my project to rise to the height of the new standards (ideally, without breaking the application).

That would take care of one of the most important reasons we need such a tool - ensuring consistency of standards between the projects. And doing that continuously.

Struggles of choice

Solution

Platform

While creating a scaffolding engine from scratch might be a good idea, I decided to review existing solutions that have the following capabilities:

  • Generator scripts should be easy to create
  • Generator scripts should be easy to compose
  • At every step, there should be freedom to decide if and how an atomic change should be done
  • The tool should be aware of Node.js context and provide relevant utilities (adding dependencies, for example)

Less important but good-to-have capabilities are:

  • The tool should have built-in ability to collect user input during the generation stage
  • The tool should have ability to work with file templates

I have reviewed a few solutions: Plop, Hygen and Yeoman.

I have to say, the mentioned tools can all provide a platform for flexible and composable generators. But achieving that in Plop and Hygen was quite an effort: the first one sets the generator parameters at the beginning of the execution (statically), and the second one is more template-oriented, while I needed to put significant pieces of code in place to control what to do and how. I found it very challenging to do what I wanted, the way I wanted, with those two tools. Yeoman appeared much more permissive in this regard.

That's why I chose Yeoman.

I have implemented my own generator and published the source code on GitHub and npmjs. Below, I'll cover the main challenges I had when implementing the logic I described above and how I approached them. Also, I'll mention what else would be great to have in a scaffolding tool like this.

Challenges

Composability

One of the requirements was composability of generators and I achieved it by introducing two general types of generators:

  1. A "Feature" generator. They execute a set of atomic operations aiming to make up a particular aspect of the project.
  2. A "Makeup" generator. They execute multiple feature generators in a particular order.

For example, ensuring that there is an npm package initialised and that the package name, author and license statements are populated is a "feature" (feature-initiate-npm-package). Initialising Jest for unit tests and configuring it (feature-initiate-jest) is also considered a "feature".

Running the first feature and then the second feature is the responsibility of a "makeup" generator.

The list of "feature" and "makeup" generators and what they do can be found in the repo README. All of them are supposed to be able to run on their own.
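The division can be sketched with plain functions standing in for generators (all names here are hypothetical; in the actual Yeoman implementation, each feature is a Generator subclass and the makeup generator composes them via Yeoman's composeWith mechanism):

```typescript
// Conceptual sketch of the two generator types. "Features" make one
// aspect of the project; a "makeup" runs features in a fixed order.

type ProjectState = { files: Record<string, string> };
type Feature = (state: ProjectState) => void;

// "Feature": ensure an npm package is initialised with a name.
const featureInitiateNpmPackage: Feature = (state) => {
  if (!state.files["package.json"]) {
    state.files["package.json"] = JSON.stringify({ name: "my-app" });
  }
};

// "Feature": ensure a Jest configuration file exists.
const featureInitiateJest: Feature = (state) => {
  if (!state.files["jest.config.js"]) {
    state.files["jest.config.js"] = "module.exports = {};";
  }
};

// "Makeup": execute the feature generators in a particular order.
function makeupNodePackage(state: ProjectState): ProjectState {
  for (const feature of [featureInitiateNpmPackage, featureInitiateJest]) {
    feature(state);
  }
  return state;
}

const result = makeupNodePackage({ files: {} });
```

Because each feature checks the current state before acting, the makeup stays idempotent: running it on an already-scaffolded project changes nothing.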

Atomic changes

To stay flexible about what to do and how, I used Yeoman's ability to define separate changes in separate generator methods that are run in a particular order (see "Adding your own functionality" in their docs). All changes are small, independent adjustments that collectively contribute to the overall project structure.

A good example is the feature-initiate-typescript generator. For every mini-change, it has a separate, safe-to-run-independently method. For example, enableEsModuleInterop gently processes the options and checks that one particular setting matches the expectations around it (it should be true for having ESM support in Jest). Only if the current value doesn't match, and in this case alone, do we change it.
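The idea behind such a method can be sketched as a small pure helper (this is illustrative, not the generator's actual code):

```typescript
// Sketch of an atomic change: inspect exactly one tsconfig setting
// and touch it only when it doesn't match the expected value.

interface TsConfig {
  compilerOptions?: Record<string, unknown>;
}

// Returns true if the config had to be changed, false if it was
// already as expected (so nothing was touched).
function enableEsModuleInterop(config: TsConfig): boolean {
  if (!config.compilerOptions) config.compilerOptions = {};
  const options = config.compilerOptions;
  if (options.esModuleInterop === true) {
    return false; // already as expected - leave the file alone
  }
  options.esModuleInterop = true; // the one atomic change
  return true;
}
```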

By creating these atomic independent flexible brush strokes, we can construct the project carefully, having flexibility to do something extra if needed.

A good example of something extra can be found in another generator - feature-initiate-npm-package. When creating a LICENSE file, we also check whether there is an existing LICENSE.md or LICENSE.txt in the folder already. If there is, an additional cleanup should be done. And with these atomic changes, we can make such changes in an optimal way whenever needed.
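That check can be sketched as follows (illustrative, not the generator's actual code; the real generator works against the project folder rather than a plain string array):

```typescript
// Sketch of the "something extra": given the file names already in
// the project folder, find stale license files that should be
// cleaned up before a plain LICENSE file is written.

const STALE_LICENSE_NAMES = ["LICENSE.md", "LICENSE.txt"];

function findStaleLicenseFiles(existingFiles: string[]): string[] {
  return existingFiles.filter((name) => STALE_LICENSE_NAMES.includes(name));
}
```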

Profile data

My goal is to keep multiple projects generated with the tool consistent. That means the decisions that have been made before should be taken into consideration at every run of the generator. Let's consider three different situations:

1. Setting project name

Every project has a name, and it's one of the first fields you fill in package.json. When you start a project and fill in this field, why would you ever change it using a generator? Once user input is received, it can be stored in the project itself. And at every subsequent run of the generator, if the name is already stored in a project file, we can make an assumption - nothing to do here, it's named already.

In such cases, package.json seems to be the most natural place to store such information between the generator runs. So, the project configuration is considered generator configuration in such cases.
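The decision of whether to prompt can be sketched like this (illustrative; in the real generator, the file would be read with Yeoman's this.fs.readJSON()):

```typescript
// Sketch: package.json doubles as the generator's memory for
// one-off answers. If "name" is already filled in, there is
// nothing to ask.

interface PackageJson {
  name?: string;
}

// Ask for a project name only when package.json is missing or unnamed.
function needsProjectName(pkg: PackageJson | undefined): boolean {
  return !pkg || !pkg.name;
}
```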

2. Choosing distribution type

Every project makes assumptions about the model of code distribution. Some are open source, some are proprietary and all of them should have right license information filled. But the way of specifying license and terms of use for a category of projects can change over time. Today, you use "LGPL-3.0" for your open source projects, tomorrow, you might decide that "MIT" is a better choice.

In such a case, you can't judge by the presence of a license file that the project is all set. The project needs to remember the distribution type, not which license it meant at the time of running.

For such things, Yeoman provides the Storage API that you can use from the code of your generator. Every configuration value is preserved in a special JSON file (.yo-rc.json) in the root of your generated project, and this configuration can be read from and written to while the generator is running.
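For illustration, after a call like this.config.set('distributionType', 'opensource'), the .yo-rc.json in the project root would contain something along these lines (generator name and values are illustrative):

```
{
  "generator-my-makeup": {
    "distributionType": "opensource"
  }
}
```

On the next run, this.config.get('distributionType') returns the stored answer, so the generator can apply the current license standard for that distribution type without asking again.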

Good job, Yeoman.

3. Setting an author of a project

Another type of information that you might want to preserve between generator runs can, in fact, be reused between projects. For example, once you set "author.email" for one project, it might be used in other ones. There shouldn't be a need to ask for it again and again. It should be profile-level configuration.

Unfortunately, this feature is not well documented in Yeoman (I haven't found any mention of it in the documentation; I found hints of it in the source code of yeoman-generator). Also, the global .yo-rc-global.json file created in the home directory saves prompt answers per project. That's not enough for my project, so I have created a separate global config storage. Check out the file globalConfigurationUtilities.ts if you're curious to learn more.
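The approach can be sketched as a tiny JSON store (an assumption about the shape of the solution, not the actual code of globalConfigurationUtilities.ts; the file name is hypothetical):

```typescript
// Minimal sketch of a profile-level (cross-project) config store.
// The base directory is a parameter so real code can pass
// os.homedir() while tests can point at a temporary folder.
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

const PROFILE_FILE = ".my-generator-profile.json"; // hypothetical name

function readGlobalConfig(baseDir: string): Record<string, unknown> {
  const file = path.join(baseDir, PROFILE_FILE);
  return fs.existsSync(file) ? JSON.parse(fs.readFileSync(file, "utf8")) : {};
}

function writeGlobalConfig(baseDir: string, config: Record<string, unknown>): void {
  fs.writeFileSync(path.join(baseDir, PROFILE_FILE), JSON.stringify(config, null, 2));
}

// In real use, the profile would live in the home directory:
const defaultProfileDir = os.homedir();
```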

Three types of configuration storage are enough for now

By choosing the right place for storing each configuration value, I can ensure that information is asked for only as many times as strictly needed. That simplifies repeated runs of the generator for maintenance purposes. I need them to be as straightforward as possible.

Futuristic blueprint

Conclusion

The full code of the generator I was experimenting with can be found on GitHub and npmjs. I'm glad to have the generator as a dedicated place to store design standards for the dozens of microapplications that I create. What is especially important in this very case: it is not a one-time activity but a continuous process of keeping standards consistent between my projects over a long time.

Of course, for this particular generator, it's only the beginning. A lot of things need to be considered and addressed in the future.

Project types are different. Frontend is not like backend. A REST API microservice significantly differs from a GraphQL microservice. That will bring another challenge of composing general "feature" generators with very specific "feature" generators. It should be possible to do this elegantly, but it will impact the generator project structure.

Also, to be able to use the generator in different contexts (for your personal projects, for your company's projects, for projects you help a friend with and so on), I would like to introduce proper support for profiles that will allow varying for-all-projects presets.

And last but not least, I would be happy to make the generator capable of assisting with possible migrations. For example, a new standard version of a dependency might have breaking changes (some method or argument might be removed), and it's possible to identify the disaster scope automatically. Is this removed method used in any of the project files? Do updated methods receive the right number of arguments? That should be pretty feasible by leveraging text search and AST tools.
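The cheapest form of such a check can be sketched with plain text search (illustrative; an AST-based check, e.g. via the TypeScript compiler API, would be more precise and could also verify argument counts):

```typescript
// Sketch: find the "disaster scope" of a removed method by scanning
// source text for call sites, returning 1-based line numbers.

function findRemovedMethodCalls(source: string, methodName: string): number[] {
  const pattern = new RegExp(`\\.${methodName}\\s*\\(`);
  const hits: number[] = [];
  source.split("\n").forEach((line, index) => {
    if (pattern.test(line)) hits.push(index + 1);
  });
  return hits;
}
```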

These are some of the challenges I foresee when working on this generator in the future.

I would love to hear about your experiences with scaffolding across multiple projects, your challenges and your solutions. If you have an opinion about the challenges I have mentioned above, and especially about how they might be approached in an elegant way, do let me know through comments, email messages, GitHub issues and pull requests. The generator is designed to grow - the generators are written in TypeScript and covered with Jest tests, also written in TypeScript.

I'll post a story about covering the generator with tests shortly. Thanks for reading.