How to build an app where broadcasts expire after 24 hours

Quite a few popular apps centered around user-generated content remove content by design after a period of time. Snapchat pioneered the idea, and Instagram took the same path with its stories concept: stories are removed after 24 hours by default, but can optionally be saved permanently into highlight buckets on the user's profile. The live streaming apps Periscope, YouNow and Meerkat have all experimented with various retention policies over the years, in part to save on storage costs.

Purging in a Bambuser-based app

If you focus on live-only streaming and never want to store content in your app based on Bambuser's broadcasting SDKs, you can set the save on server flag to false (Android docs | iOS docs).

In case you do want to offer archived playback, but for a limited time only, you need to trigger the unpublishing and actual removal yourself, since Bambuser does not have a setting for automatic time-based purging of stored broadcasts. It is quite simple to set up such a workflow via the REST API with a few lines of custom code:

Node.js / AWS Lambda

The JavaScript snippet below is an example of how to paginate through the broadcast listing API with a time-based filter and then delete all broadcasts older than 24 hours.

Node.js 8 or later is required to run the example code due to use of the async/await syntax, but the same approach should be viable in any language capable of doing HTTP requests.

The main function is exported under the name handler - this is what AWS Lambda expects by default. However, you should be able to use Google Cloud Functions, Heroku or any other Node.js hosting platform without any significant code changes. To simplify deployment, we include a slim REST helper function and we avoid using any npm dependencies.

You also need to ensure this function runs regularly on your server: every hour, every 15 minutes, or whatever you prefer, depending on how precisely removal should track the 24-hour mark. Note that you might hit API rate limits if you fire off many requests in rapid succession. The example script aborts if it hits a rate limit, then resumes removal gracefully on the next scheduled execution.
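If you would rather retry transient failures within the same run instead of waiting for the next scheduled execution, a small backoff wrapper could be used around each request. This is a hypothetical helper, not part of the script below; `withRetries` and its parameters are names invented here for illustration:

```javascript
// Retry a promise-returning function with exponential backoff.
// Gives up and rethrows after the given number of attempts.
const withRetries = async (fn, attempts = 3, baseDelayMs = 100) => {
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      if (i === attempts - 1) throw err;
      // Wait 100ms, 200ms, 400ms, ... before the next attempt
      await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** i));
    }
  }
};
```

Whether retrying in-process is worth it depends on your rate limits: for a scheduled cleanup job, simply aborting and letting the next run pick up the remainder (as the example script does) is often the simpler design.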

Most platforms have a way to run scheduled tasks. On AWS you can use CloudWatch events to accomplish this easily (see below).

/**
 * Lambda function that removes all broadcasts older than 24 hours
 * WARNING: delete is an irreversible operation! Make sure the API key
 * really belongs to the environment you intend to remove content in!
 */
const API_KEY = 'CHANGEME';

exports.handler = async (event) => {
  // Let's remove all broadcasts older than this unix timestamp (seconds, not milliseconds)
  const cutoffTS = Math.round((new Date()).getTime() / 1000) - 24 * 3600;

  // Fetch a (paginated) list of all broadcasts
  // https://bambuser.com/docs/api/get-broadcast-metadata/#get-metadata-for-multiple-broadcasts

  let removedCnt = 0;
  let body;
  do {
    const res = await bambuserAPIReq('GET', '/broadcasts?createdBefore=' + cutoffTS + '&limit=50' + (body && body.next ? '&after=' + body.next : ''));
    body = res.body;
    console.log(`Fetched page of ${body.results.length} broadcasts`);

    // We asked for a time-filtered set, but delete is destructive:
    // let's double check that the returned broadcasts are indeed old enough.
    // Also: Let's append the length when filtering, so that the broadcast
    // expires 24 hours after it ended, rather than 24 hours after it started.
    const expiredBroadcasts = body.results.filter(b => b.created && (b.created + (b.length ? b.length : 0) < cutoffTS));
    console.log(`Found ${expiredBroadcasts.length} broadcasts to delete in page`);

    for (let b of expiredBroadcasts) {
      // Delete broadcasts one at a time
      // https://bambuser.com/docs/api/removing-media/
      console.log(`Deleting broadcast ${b.id}`);
      await bambuserAPIReq('DELETE', '/broadcasts/' + b.id);
      removedCnt++;
    }

  // Continue paginating until the api doesn't have any more matches for us
  } while (body.results.length && body.next);

  return {
    statusCode: 200,
    body: JSON.stringify({removedCnt}),
  };
};

const bambuserAPIReq = function(method, path) {
  return new Promise((resolve, reject) => {
    const req = require('https').request({
      hostname: 'api.bambuser.com',
      port: 443,
      method,
      path,
      headers: {
        Accept: 'application/vnd.bambuser.v1+json',
        Authorization: 'Bearer ' + API_KEY,
      },
    }, res => {
      let body = [];
      res.setEncoding('utf8');
      res.on('data', (chunk) => body.push(chunk));
      res.on('end', () => {
        body = body.join('');
        try {
          body = JSON.parse(body);
          console.log('JSON ' + res.statusCode + ' res');
        } catch (e) {
          console.log('Non-JSON ' + res.statusCode + ' res');
        }
        if (res.statusCode < 200 || res.statusCode > 299) {
          const e = new Error('HTTP error ' + res.statusCode);
          e.body = body;
          reject(e);
        } else {
          resolve({ body });
        }
      });
    }).on('error', (err) => reject(err));
    req.end();
  });
};
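The expiry check inside the loop is easy to get subtly wrong (seconds vs. milliseconds, a missing length field for live broadcasts), so it can be worth extracting and unit testing on its own. A sketch using the same field names as the API response, with the function names invented here for illustration:

```javascript
// Compute the unix cutoff timestamp (in seconds) for a retention window.
// nowMs is injectable to make the function testable.
const cutoffFor = (hours, nowMs = Date.now()) =>
  Math.round(nowMs / 1000) - hours * 3600;

// A broadcast is expired when it *ended* before the cutoff, i.e. its
// creation time plus its length (0 when unknown) is past the window.
const isExpired = (b, cutoffTS) =>
  Boolean(b.created) && (b.created + (b.length || 0)) < cutoffTS;
```

Note that `isExpired` deliberately returns false for broadcasts without a `created` timestamp: when in doubt, it is safer to keep a broadcast than to delete it.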

In the AWS console, choose Lambda and click Create function.

Choose Author from scratch and give the function an appropriate name. Choose Node 8.x or later as runtime and choose a suitable execution role. The default role that Amazon creates for you if you haven't set one up should be fine.

In the function's Configuration section, in the code editor, drop the snippet above into index.js.

Make sure to replace CHANGEME near the top with your actual API key, which you can generate on the Developer page in the Bambuser dashboard. The key needs the read media and remove media scopes.

Lambda terminates your execution if it exceeds the configured timeout, and the default is probably too low. You can adjust the timeout on the function's main page, below the code editor. A minute or so should be more than enough for moderate amounts of content. As with rate limiting, an occasional abort is not a problem: the next run resumes where the previous one left off.

CloudWatch: scheduled trigger events

In the AWS console, choose CloudWatch and click Events > Rules in the lefthand sidebar. Then click Create rule.

As event source, choose Schedule and a Fixed rate of 60 minutes, or perhaps 15 or even 5 minutes if actual removal needs to happen close to the calculated expiry time. Keep the rate longer than your function's execution timeout, though, so that executions do not overlap.
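If you need more control than a fixed rate, CloudWatch schedule expressions also support a cron-like form. For example:

```
rate(1 hour)          run once per hour
cron(0/15 * * * ? *)  run every 15 minutes, on the quarter hour
```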

Tip: It can be a good tradeoff to hide the broadcast at the exact calculated time in the presentation layer - i.e. in your own app or backend - and let the actual purging happen on a more relaxed schedule.
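As a sketch of that approach, your app or backend could filter out expired broadcasts at read time and leave the scheduled job to do the actual deletion. This is a hypothetical helper (the function name is invented here), reusing the same `created` and `length` fields as the purge script:

```javascript
// Hide broadcasts whose retention window has passed, even if the purge
// job has not deleted them yet. nowSec is injectable to make it testable.
const visibleBroadcasts = (broadcasts, maxAgeHours = 24,
                           nowSec = Math.round(Date.now() / 1000)) =>
  broadcasts.filter(b =>
    (b.created + (b.length || 0)) >= nowSec - maxAgeHours * 3600);
```

This way users see content disappear exactly on time, while the backend purge can run every hour or less often.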