Memory leak concerning strings

Hi,

I’ve just discovered a performance issue when modeling large BPMN models.

This is the memory footprint before editing a few connections in the model:

memory_low

After editing some labels and adding/removing some connections, the memory usage for all strings increases from 2 MB to 14 MB, while the number of strings stays at about 30,000:

memory_high

Thanks in advance,
rene51

Interesting discovery. What would you like us to do? File a bug? Provide assistance?

If you’d like to report a bug, please build a reproducible test case for the behavior you’re seeing based on our starters.

I would like to report a bug :)

I have used this example:
performance_test.bpmn (155.7 KB)

After deleting and rebuilding some connections between tasks, a huge increase in memory usage for strings can be observed:
memory_usage

Kind regards,
rene51

Thanks for the clarification @rene51.

I’ve simulated a quick modeling session (1:10 minutes) based on your diagram. During the session I recorded the JS heap allocation using the Chromium Dev Tools. My observation, based on the recording (screenshot below), is that there is not much we need to worry about right now.

I’m happy to be proven wrong of course!

Understanding Memory Allocation

To better understand the memory allocation of our library, two things are important to know:

  • We record user operations using the CommandStack. This allows users to undo and redo what they did during a modeling session. To realistically detect leaks you’d need to call CommandStack#clear() as part of your test run in order to clear the edit log.
  • A lot of things are happening behind the scenes during modeling. Once in a while the browser performs garbage collection (GC). That, however, is approximate and may not find all objects to be cleaned up in a single sweep (cf. MDN memory management for a good read on the basics). To do a proper check you must perform a manual GC at least once after you have completed your test run.
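The two points above could be combined into a small helper that runs right before you take a heap snapshot. This is only a sketch: `prepareForHeapSnapshot` is a made-up name, `modeler` is assumed to be your bpmn-js instance, and `gc()` is only available if Chromium was started with `--js-flags="--expose-gc"`. Without that flag, use the "Collect garbage" button in the Dev Tools instead.

```javascript
// Sketch only: prepareForHeapSnapshot is a hypothetical helper name.
// `modeler` is assumed to be a bpmn-js instance exposing modeler.get(...).
function prepareForHeapSnapshot(modeler, globalObject = globalThis) {

  // clear the edit log so undo/redo entries no longer retain elements
  modeler.get('commandStack').clear();

  // gc() only exists when the engine was started with --expose-gc;
  // otherwise trigger garbage collection manually in the dev tools
  if (typeof globalObject.gc === 'function') {
    globalObject.gc();
    return true;
  }

  return false;
}
```

The return value tells you whether a manual GC was actually triggered, so a test run can warn if the flag is missing.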

Annotated Modeling Session

This screenshot shows the heap allocation during a simulated modeling session.

bitmap

I’m happy to take a look into the scenario you’ve created if you automate the modeling operations.

You may use the modeling API to do this or the bpmn-js-cli.
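As a rough idea of what such an automation could look like with the modeling API, here is a sketch that removes and re-creates every connection a number of times. `churnConnections` and `rounds` are names I made up for illustration; `modeler` is assumed to be a bpmn-js modeler with the diagram already imported.

```javascript
// Sketch only: churnConnections is a hypothetical helper, not part of bpmn-js.
function churnConnections(modeler, rounds = 10) {
  const elementRegistry = modeler.get('elementRegistry');
  const modeling = modeler.get('modeling');
  const commandStack = modeler.get('commandStack');

  let recreated = 0;

  for (let round = 0; round < rounds; round++) {

    // take a snapshot of the connections, the registry changes while we edit
    const connections = elementRegistry.filter(element => !!element.waypoints);

    connections.forEach(connection => {

      // remember both ends before removal, they are unset afterwards
      const { source, target } = connection;

      modeling.removeConnection(connection);
      modeling.connect(source, target);

      recreated++;
    });
  }

  // clear the edit log so the undo history does not show up as a "leak"
  commandStack.clear();

  return recreated;
}
```

Record the heap before and after calling it, with a manual GC in between, to get comparable numbers.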

Thank you very much for your clarification!!
I will try to perform a manual garbage collection and check whether the performance of my application improves!

For context: I use your BPMN.io JavaScript library in a WinForms C# application, embedded in a browser control. Do you have any experience with this approach, and have you come across performance problems with it?

Kind regards,
rene51

No experience, unfortunately.

Hello,

I’m now able to automatically generate the BPMN model from our own database (an XML file) with the help of the ‘cli’ interface and a recursive walk through our data.
But now I face a problem when I try to update labels…

At first, updating the labels works fine, but after about 50 labels each update takes longer and longer…
Perhaps it is a problem with rendering…

Is it possible to deactivate the rendering process, so that I can update all labels first and then render the whole diagram?
Does anyone have experience with automatically generating models with the cli interface and the resulting performance problems?

This is possible. You can deactivate rendering by intercepting all related events:

function intercept() {

  // return true to stop propagation
  return true;
}

eventBus.on([
  'render.shape',
  'render.connection',
  'shape.added',
  'shape.changed',
  'connection.added',
  'connection.changed'
], 100000, intercept);

If you want to enable rendering again, you can stop intercepting the related events and fire the events that trigger a re-render of all elements at once:

eventBus.off([
  'render.shape',
  'render.connection',
  'shape.added',
  'shape.changed',
  'connection.added',
  'connection.changed'
], intercept);

const elements = elementRegistry.getAll();

const shapes = elementRegistry.filter(element => {
  return !element.waypoints && element.parent;
});

const connections = elementRegistry.filter(element => {
  return !!element.waypoints && element.parent;
});

shapes.forEach(shape => {
  const gfx = elementRegistry.getGraphics(shape);

  eventBus.fire('shape.added', {
    element: shape,
    gfx: gfx
  });
});

connections.forEach(connection => {
  const gfx = elementRegistry.getGraphics(connection);

  eventBus.fire('connection.added', {
    element: connection,
    gfx: gfx
  });
});

eventBus.fire('elements.changed', {
  elements: elements
});

This way you can avoid the actual rendering.

Thank you very much!

Unfortunately, the bug must be somewhere else: generating the labels still gets slower and slower…

What exactly are you doing?

Hello Philipp,

In my project all relevant data of a process is saved in a custom XML file, and from this XML file I would like to automatically generate a BPMN model. This works fine with small processes, but with big workflows it takes a lot of time. I used the “bpmn-js-cli” library.

For example, this function creates an arbitrary BPMN element:

function cliCreateElement(type, label, xPos, yPos) {
     var cli = window.cli;

     // create element
     var id = cli.create(
         type,           // type
         {
             x: xPos,    // X-Position
             y: yPos     // Y-Position
         },
         'Process_1'     // parent
     );

     // get the created element and set a label if needed
     var createdElement = cli.element(id);
     if (label !== null) {
         // this line takes a lot of time, depending on how many elements are already created...
         cli.setLabel(createdElement, label);
     }

     // save the "di.id" and "element" pair for the generation of connections
     cliElementsDictionary.set(createdElement.businessObject.di.id, createdElement);

     // clear the command stack
     var commandStack = modeler.get('commandStack');
     commandStack.clear();

     // return the created element
     return createdElement;
}

setLabel will trigger a calculation of the bounding box of the label shape. This is a costly operation and can take quite a lot of time. Unfortunately, you have to do it at some point.
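To see how the cost of setLabel develops as the diagram grows, you could time each call separately. The following is only a sketch: `timeLabelUpdates` and the `entries` list of `{ element, label }` pairs are hypothetical, and `cli` is assumed to be the bpmn-js-cli instance (`window.cli` in the function above).

```javascript
// Sketch only: timeLabelUpdates is a hypothetical helper for measuring.
// `now` can be swapped out, by default it uses the high resolution timer.
function timeLabelUpdates(cli, entries, now = () => performance.now()) {
  return entries.map(({ element, label }, index) => {
    const start = now();

    // the call under test; triggers the label bounding box calculation
    cli.setLabel(element, label);

    return { index, ms: now() - start };
  });
}
```

If the measured times rise steadily with the index, the slowdown depends on the number of elements already in the diagram rather than on the individual label.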