v1.8 Released

Improvements to search interface, ability to pass in parameters to extrators from the web interface, and more.

Posted by Luigi Marini on November 06, 2019 · 5 mins read

1.8.0 - 2019-11-06

Warning: This update adds a new permission for archiving files and adds it to the Admin role. Please make sure to run Clowder with the MONGOUPDATE flag set to update the database.


  • /api/search endpoint now returns JSON objects describing each result rather than just the ID. This endpoint has three new parameters - from, size, and page. The result JSON objects will also return pagination data such as next and previous page if Elasticsearch plugin is enabled and these parameters are used.
  • S3ByteStorageService now uses AWS TransferManager for saving bytes - uploads larger than ~1GB should now save more reliably.
  • /api/search endpoint now returns JSON objects describing each result rather than just the ID.
  • Clean up docker build. Use new buildkit to speed up builds. Store version/branch/git as environment variables in docker image so that they can be inspected at runtime with Docker.
  • Extractors are now in their own docker-compose file. Use Traefik for proxy. Use env file for setting options.
  • Utilize bulk get methods for resources widely across the application, including checking permissions for many resources at once. Several instances where checks for resource existence were being done multiple times (e.g. in a method and then in another method the first one calls) to reduce MongoDB query load. These bulk requests will also report any missing IDs in the requested list so developers can handle those appropriately if needed.
  • Removed mini icons for resource types on tiles. CATS-1031
  • Cleaned up pages to list and enable extractors. Added description of what the page does. Added links to extraction info pages. Removed authors from table.


  • Ability to pass runtime parameters to an extractor, with a UI form dynamically generated UI from extractor_info.json. CATS-1019
  • Infinite scrolling on search return pages.
  • Trigger archival process automatically based on when a file was last viewed/downloaded and the size of the file.
  • Script to check if mongodb/rabbitmq is up and running, used by Kubernetes Helm chart.
  • Queuing system that allows services such as Elasticsearch and RabbitMQ to store requested actions in MongoDB for handling asynchronously, allowing API calls to return as soon as the action is queued rather than waiting for the action to complete.
  • New extractors monitor docker image to monitor extraction queues. Monitor app includes UI that shows selected information about the extractors
  • New /api/thumbnails/:id endpoint to download a thumbnail image from ID found in search results.
  • New utility methods in services to retrieve multiple MongoDB resources in one query instead of iterating over a list.
  • Support for MongoDB 3.6 and below. This required the removal of aggregators which can result in operations taking a little longer. This is needed to support Clowder as a Kubernetes Helm chart. CATS-806
  • New Tree view as a tab in home page to navigate resources as a hiearchical tree (spaces, collections, datasets, folders and files). The tree is lazily loaded using a new endpoint api/tree/getChildrenOfNode.


  • Downloading metrics reports would fail due to timeout on large databases. Report CSVs are now streamed to the client as they are generated instead of being generated on the server and sent at the end.
  • Social accounts would not properly be added to a space after accepting an email invite to join it.
  • Fixed bug where extractors monitor will not print error, but just return 0 if queue is not found.
  • Pagination controls are now vertically aligned and use unescaped ampersands.
  • Changing the page size on dataset, collection, space listings would not properly update elements visible on the page. CATS-1030
  • Added a max of 100 status messages on the page listing all extractions. Before this trying to list all extractions in the system was causing the JVM to run out of memory. CATS-1032
  • Added padding to the top of the footer so that the superadmin notification does not cover buttons and the buttons at the end of forms are not too close to the footer and difficult to see.