Data Warehousing, BI and Data Science

25 September 2018

Data Warehouse for Asset Management

Filed under: Data Warehousing — Vincent Rainardi @ 6:50 am

A. Brief Summary

A.1. Fact Tables

The main fact tables in an asset management warehouse are position, performance, risk and transaction. In the next 4 paragraphs I’ll explain what they are briefly. Later on in the article I will be explaining them in more detail.

The position fact table (aka valuation fact table) contains the daily holdings and benchmark constituents for every portfolio, both security/instrument positions and cash positions. The main issue with this fact table is the matching between the holding/portfolio positions and benchmark positions. We have the option of not attaching the benchmark, which makes it flexible particularly for portfolios having more than one benchmarks. The position fact table is mainly used to provide portfolio breakdown, and top N holdings. The other issue is lookthrough.

The performance fact table contains the performance returns of every portfolios compared with their benchmarks, for many different periods e.g. 3M, 6M, 1Y, 3Y. There is also another type of performance fact table, which contains the performance attribution to different factors such as yield curve, sector allocation and security selection. Performance can also be allocated to different sectors, currencies and asset classes, each compared to the benchmark. Please see my article here.

Because of the non additive nature of risk numbers, there are many risk fact tables. They mainly hold the risk sensitivities such as interest rate sensitivities (PV01, DV01), inflation sensitivities (IE01), credit rate sensitivities (CR01) and exchange rate sensitivities. The risk fact table also stores the VAR data (value at risk) at portfolio level and at “carved-up” level (for example, industry sector, currency and asset class)

Transaction fact table stores every buy, sell and corporate action in every portfolio, including the fees and charges. The transaction status, which changes at every stage on different dates, can be stored on an accumulated snapshot fact table. Whereas the trades data can be stored on a transaction fact table.

Other business areas requiring fact tables are:

  • Fund administration e.g. redemptions, subscriptions, distribution payments
  • Liquidity management e.g. cash ladder (projected cash)
  • Collateral management and counterparty exposure
  • ESG data (Environment, Social, Governance)

A.2. Dimensions

In an asset/investment management DW, the following dimensions are commonly found:

  1. Security dimension (aka Instrument dimension or Asset dimension) which contains 3 types of securities: held securities, benchmark securities and watchlist securities.
    Note: cash is a financial instrument but not a security. Only equity, fixed income and money market are financial securities, including derivatives (refer to The Securities Exchange Act, section 3 part 10)
  2. Issuer dimension (aka company dimension) containing a list of public and private companies who issue shares or bonds, had an OTC trade/deal with us, or has ESG data.
  3. Client dimension (aka customer dimension) containing a list of clients from the CMS such as Salesforce, with their usual attributes such as client type, Mifid category, industry sector, and address.
  4. Portfolio dimension (aka account dimension, product dimension or fund dimension) which contains a list of open funds, segregated account/mandate and trust funds that the company runs.
  5. Benchmark dimension, containing a list of benchmarks which the portfolios are measured against, including blended benchmarks. The “pairing” between portfolios and benchmarks is not done in this dimension.
  6. Country dimension covers two country code schemes (ISO 2 chars) and (ISO 3 chars) and 2 to 3 country name columns (one for each OMS/IMS, plus ISO).
  7. Currency dimension is pretty static. A country column is not useful here, as the currency name already includes the country and some currencies are used by several countries. It is usually a good practice to update country and currency dimensions manually.
  8. Rating dimension. Covers Fitch, Moody’s, and S&P. A good rating dimension have the 3 agencies as columns so we get the translation between them (instead of stacking them up as rows with a “type” column). Also covers “major” rating and “notched” rating as two different attributes/columns (AA is a major and AA+ is a notched). Also covers cash ratings (Moody’s), which is quite different from credit rating. Don’t put the “watch” rating and “outlook” here, they should be a DD on the security dimension.
  9. Industry Sector dimension. Covers at least ICB and GICS, along with major providers such as ML, Barclays, Bloomberg. Some are 4 levels deep, some 7 levels. Unlike rating, don’t try to do translation between providers here, so stack them up with a “provider” column, then level 1 to level 7 columns.
  10. Asset class dimension (aka asset type, instrument type, asset group) which contains a list like: equity, fixed income, cash, property, commodity as level 1. Level 2 contains the breakdown for example for fixed income the level 2s are bond, floating rate note, CDS, bond future. For bond the level 3s are callable/puttable or government/corporate.
  11. Dimensions related to performance return attributes: return type (gross or net of fees), accounting method, cumulative or annualised, arithmetic or geometric. We could model them as a junk dimension because most attributes only have 2-3 values, if we don’t mind losing the description. I don’t recommend the performance period to be put here (e.g. 3 month, 1 year, 5 years), best to put them as columns in the performance return fact table (having them as a column in the performance return is second best).
  12. For the sake of completeness, date dimension.

A.3. Issues

The usual issues/dilemmas with designing the dimensions are:

  1. Do we include country of risk/incorporation, currency, industry sector, etc. in the security dimension? Absolutely, as type 2 attributes. Plus we also have them as separate dimensions in their own right.
  2. How do you model derivative instruments which has very different attributes? By creating Option, Swaps, Futures and CDS as separate dimensions. Don’t separate the OTCs (Over The Counter) Derivatives, e.g. options, swaps, CDS) from ETDs (Exchange Traded Derivatives, e.g. futures, forwards). Don’t create too many dimensions for derivatives, but group them up. 4 or 5 dimensions sounds about right. Don’t worry too much if certain attributes are duplicated between the main security dimension and the derivative instrument dimension, it’s ok.
  3. Do we put cash in the security dimension or not? Don’t. Only real financial instruments should be in the security dimension (including CD, Repo, OTC and ETD), but leave settled and unsettled cash out (including collateral cash for margin requirements) See my article here for more info: link. Cash in different currencies should be put directly in the position fact table, clearly labelled as settled or unsettled as many portfolio management decisions hang on this label.

B. Details – Fact Tables

B.1. Position Fact Tables

Position fact tables store the valuation for each position from the back office accounting system, including adjustments. The valuation is daily, but the one at the month end is continuously adjusted / restated until a few weeks later.

For example, every business day we could be receiving:

  1. Daily position valuation from the OMS system
  2. Daily position valuation from the accounting system
  3. Today’s version of the last month end position valuation from the accounting system

The valuation is not only on the securities that we hold, but also the cash. There are a few things we need to be aware here, with regards to the validity of our position fact table:

  1. The cash amount is dependent on the settlement status of the transactions for that day.
    For example, this morning we had £1m cash, and this afternoon we bought £300k worth of equities and bonds. So what’s our cash position at the end of today? Answer: Settled cash = £700k, unsettled cash = £300k.
  2. The securities we hold is affected by the trade status.
    For example, today the portfolio manager put an order to buy £200k of bond X. If the status of the order is “executed by the broker” (it’s called “filled orders”) then we will have that security in today’s position. But if the status of the order is “not yet executed by the broker” then we don’t have that security in today’s position. There are many stages of an order/trade including “draft/created”, “verified”, “authorised”, “compliance checked”, “placed with/sent to the broker”, “confirmed by the broker”, “filled/executed by the broker”, “allocated”, “settled”, and “closed”. The position is created when the order is “executed”.
  3. The value of an equity position is the number of shares we hold x the previous day price of that equity (close of business price, e.g. when the exchange ended trading – the exact hour depends on which country the exchange is located). Like wise of bonds and any exchange traded securities. But for OTC derivative positions like options and swaps, the valuation requires some calculations to determine the “fair” market value, e.g. Black-Scholes model for options (this discipline is generally called “derivative pricing”). This is because OTC derivatives are not traded on exchanges, but a contract between two parties.
  4. Some of the positions could be valued in-house, and some positions could be valued by an external company to whom we outsource our back office operation. In this case we may need to merge two sources to get the holistic view.

Design of Position Fact Tables:

  • Name: Fact Holding, Fact Valuation or Fact Position. “Holding” restricts us from including benchmark positions. “Valuation” restricts us from including fixed income analytics. So in my opinion “Position” is the best name.
  • Grain: 1 row for each position in every portfolio and benchmark per day.
  • Portfolio and benchmark: two separate position fact tables would cause all sort of painful issues when joining, so put into 1 fact table but join on the front end OMS (Charles River, Aladdin, Simcorp, ThinkFolio). Do not join portfolio and benchmark positions in the DW. This is probably the most important point in the this article.
  • Front end and back end: put data from back end (State Street, Fund Accounting, Northern Trust) and front end (CRD, Aladdin, TF) in two separate position fact tables. The former contains accounting/finance valuations and the latter contains daily trade-by-trade valuations.
  • Surrogate keys: date key (called reporting date), portfolio key or benchmark key, security key. Those are SKs which make up the grain. Other SKs are: currency key, client key.
  • History: stores 5 years of daily snapshot.
  • Partition: partitioned on reporting date.
  • Indexing: index on date key, portfolio key and security key. Also index on other SK columns.
  • Restated month end valuation: stores in a separate fact table.
  • Measures: valuations of each position in 3 currencies (portfolio, base and local) and their implied FX rates, fixed income analytics such as durations, yields and spreads, maturities (in years, 4 dp).
    Note: Analytics numbers are from front end OMS. Fund accounting/finance/back office only provides valuation numbers.

Loading: daily file from back office outsourcing/finance and daily export from OMS.
If the valuation of OTC derivative positions is conducted in house rather than outsourced, this needs to be merged into the back office data.

Usage: correct security key is vital as this fact table is the main source for reporting aggregated positions (portfolio breakdown) by currency, asset class, country, industry sector, rating, etc. which is required by both clients and regulator. Mostly usage are only interested in the month end, but settled positions. Some requires executed (unsettled) positions.


  • Lookthrough: store the positions without a lookthrough. A lookthrough position is a fund held in the portfolio, usually our own funds. In a portfolio, in addition to holding cash, equity or fixed income position, we also hold a fund. We should not “expand” the “content” of this fund. Instead we should store the value of the whole fund. If we “expand”, the valuation of other positions become incorrect, particularly cash positions.
  • Unsettled cash positions: in the Position fact table, separate unsettled cash positions from settled cash. The separation and calculation of the unsettled cash should be done within fund accounting, not within the DW.
  • Mid day vs end of day valuations: always take end of day valuations, even if we do the valuations in house (e.g. for options and swaps).

B.2. Performance Fact Tables

Performance is calculated in the back office system such as Stat Pro, Barclays Point (now Bloomberg Port), MSCI BarraOne, or outsourced to back office provider such as State Street or Northern Trust (Passport).


  • Name: Fact Performance Return (to differentiate from Performance Attribution).
  • Grain: 1 row for each share class, per month, per return type (gross or net return), per return calculation (arithmetic or geometric return), per data source, per row type (portfolio/benchmark/excess return), per cumulative type (annualised or cumulative), per price/total type (price return or total return which is price+income).
  • Surrogate keys: reporting date key, account key (1 account for each share class), benchmark key, currency key, return type key, return calculation key, data source key
  • Degenerate dimensions: performance row type (P for portfolio, B for benchmark, E excess return = R for relative return = A for active return), cumulative type (A for annualised, C for cumulative), performance total/price type (P for Price Only Returns, T for Total Return which is Price + Income), return calculation (A for arithmetic, G for geometric).
    Alternatively we can also put all these DD columns in a junk dimension.
  • Measures: 1M, QTD, 3M, 6M, YTD, 1Y, 3Y, 5Y, 10Y (M = month, Q = Quarter, Y = Year). Some asset managers also have 2Y and 15Y.
  • Loading: from StatPro/Simcorp/Aladdin output or files from the back office profider such as NT or State Street.


  • Some have MTD (month to date), which stores the daily intra-month returns.
  • A popular alternative design is to create a period dimension, containing 1M to 10Y, effectively making this fact table a vertical fact table. I prefer to have a separate measure column for each period, because the first thing we do when we have a vertical performance fact table is to pivot it so that the 1 column of numbers becomes 1 row of numbers. Also because the number of periods are relatively fixed.
  • Performance numbers have to be signed off. Before sign off the data is available, and can be used for reporting, but there a (small) risk that the data is incorrect – which in some cases is acceptable. So we need to add a DD column (or part of the junk dimension) indicating the sign off status.
  • Another attribute which should be added to this fact table is the sign off date (as a date key).

Performance Attribution Fact Table

The active return or excess return or relative return is the return of the portfolio minus the return of the benchmark. This active return can be attributed/allocated to several factors, i.e. yield curve, duration, security selection, currency, credit, carry and sector allocation. This is called performance attribution data. For example, a relative return of 6% can be attributed to:

  • 2% because of duration
  • 1% because of security selection
  • 2% because of sector allocation
  • 1% because of carry


  • Name: Fact Performance Attribution
  • Grain: 1 row per portfolio, per reporting date.
  • Surrogate keys: reporting date key, portfolio key, benchmark key.
    Additionally: currency key.
  • Measures: yield curve, duration, security selection, credit, carry, sector allocation, and total.
  • Loading: from StatPro/Point/Barra output or files from the back office profider such as NT or State Street.

B.3. Risk Fact Tables

The risk fact tables store the sensitivities numbers and value at risk. They also stores the results of “stress test” on multiple scenarios, such as European sovereign default, global economic recession, oil price fluctuation, tightened credit markets, liquidity squeze, exchange rate fluctuation, Brexit scenarios, raise of US interest rates.

I’ll start with sensitivities fact table, the VAR fact table, then stress test fact table. Note: IE01 reflects how sensitive the portfolio is to inflation. PV01 reflects how sensitive the portfolio is to interest rates.

Sensitivities fact table:

  • Description: combine IE01, PV01 and DV01 in one fact table, storing them as different measures.
  • Name: Fact Portfolio Sensitivities
  • Grain: 1 row per portfolio, per day.
  • Surrogate keys: calculation date key (when the sensitivities were calculated), portfolio key
  • Measures: IE01, PV01, DV01. Some asset managers also stores CR01 and IR01.
  • Loading: from risk system such as IBM Algo Risk, MSCI BarraOne, MSCI RiskMetrics, Simcorp. Daily load of the output files.
  • Noge: IE01, PV01 and DV01 also calculated at position level. Put them in the position fact table. Keep the ones in portfolio sensitivities at fund/portfolio level.

VAR fact table:

  • Description: stores value at risk (VAR) at portfolio level, and at sector and country levels. Not that VAR measure is not additive, so we can’t aggregate the VAR of all European countries to calculate the VAR for Europe. So we need to store the VAR at country level and at region level. Ditto for the sector. A VAR measure depends on the confidence level (i.e. 95%, 99%, 99.9%, etc.) and the time horizon (i.e. 1 day, 10 days, etc.) For example: on 20th Nov 2018, the one day VAR of this portfolio is $1 million with a confidence level of 99%. Which means that the possibility of this portfolio losing $1m or more in one day is 1%. Whereas “10-day VAR with a confidence level of 95% is $4m” means: the possibility of losing $4m in 10 days is 5%.
  • Name: Fact Portfolio VAR
  • Grain: 1 row for each sector/country, per portfolio, per day, per horizon, per confidence level.
  • Surrogate keys: calculation date key, portfolio key, sector/country key
  • Measure: because the time horizon is fixed at 1 day and 10 days, and the confidence levels are fixed at 99.9%, 99% and 95%. I prefer to have the time horizons and confidence levels as columns so we have 6 VAR measures / 6 columns like this: VAR_1D_999, VAR_1D_99, VAR_1D_95, VAR_10D_999, VAR_10D_99, VAR_10_95. This saves us from pivoting the result all the time.
  • Rows: the rows in this fact table are one of the following 5 types:
    1. Portfolio Level
    2. Country Level
    3. Region Level
    4. Industry Level (we pick 1 scheme, such as Merrill Lynch or Barclays)
    5. Sector Level
  • Note: the tricky thing here is storing the sector or geographical segments because the measure is non-aggregatable. We do have SKs for each country, but how about the regions? Should we create rows for Asia, Latin America, Europe, etc. in the country dimension? No. We must not change the grain of a dimension, by adding higher level rows. The best way I find is the create a separate (new) VAR Segment dimension, which has these columns: Segment Scheme (value: Country, Region, Industry or Sector), Segment L1, Segment L2, Segment L3. The rows with Segment Scheme = Country have the L1 populated with country, L2 with region, L3 blank. The rows with Segment Scheme = Region have the L1 populated with region, L2 and L3 blank. The rows with Segment Scheme populated with Industry have the L1 populated with Industry, L2 with Sector, L3 with Sector Group.

Why not separating the VAR fact table as 3 tables? 1 for portfolio level, 1 for country (and region) level and 1 for industry (and sector) level? Because they are not aggregatable anyway, we can’t sum the rows. So might as well put them all in one table.

One note with regards to ML sector or Barclays sector, or GICS sectors, etc.: it is not a problem obtaining the classification scheme i.e. how different industries are grouped into sectors and super sectors. The problem is to get the data of 20,000 companies in the world, each of them is allocated into a sector. Is that company manufacturing or retail? Obtaining this data is the issue. Not just for 100 or 1000 companies, but for 20,000 public companies in the world, that is the issue. So in reality we will use the industrial classification used by our security data provider, i.e. if we are sourcing the securities data from Bloomberg then we use the industrial classification provided by Bloomberg. If we source the securities data from Thomson Reuters or Factset, then we use the industry data from them.

This is part of security/instrument mastering process. I mean attaching the ML, GICS (MSCI and S&P) or Barclays sector into each security. And attaching ratings into each security. And attaching country of risk and country of incorporation into each security (and country of domicile).

We have discussed the VAR on industry sector and on geography (countries). How about the carving-up of the VAR based on the asset classes and ratings? These two are not as common as the industry and countries, but it is possible to run them pass the risk system and get an output. If this is the case in your company, then we put them into the same VAR fact table above, with the VAR Segment Key column pointing to rows in the VAR Segment dimension for asset classes (such as IRS, CDS, bond and equity) and ratings (such as AA-, BBB+, etc.)

B.4. Transaction Fact Table

Background: Internal and external orders

There are two types of order: the one which the fund managers put in to our trader (this is called “internal-facing”) and the one which our trader put out to our brokers (this is called “external-facing”). Within our company, it is possible two fund managers (or more) put an order for the same security (on the same say or thereabout), and our trader combines them into one external order. After the order is executed, the trader allocates the security into two portfolios. The opposite can also happens, i.e. one order from a fund manager is very big so the trader decided to split it into two smaller external orders and put them to separate brokers to get better price. After the orders are executed, the trader combines the two external orders and put them both to the same portfolio belonging to that fund manager.

The transaction fact table here stores the internally-facing orders. Not the externally facing orders.

Background: order lifecycle

  • A member of the desk (e.g. a financial analyst, or the fund manager) sends an email to a trader asking them to buy or sell an exchange traded security, or make an OTC derivative deal. He specifies the quantity and the price boundary (a range of price where the desk is prepared to except) and the timing (must be same day, or can it be next day, or even day after next). If is a bond, he specifies the issue, i.e. the maturity and coupon. If it is equity, they specify the class (e.g. Alphabet class A or C, with or without voting rights), the ADR (for US), HK or Shanghai (for Chinese equity). If it’s a derivatives they specify the terms of the deal, e.g. reference rate, maturity, etc.
  • A trader enter the order into their blotter, recording the security (e.g. ISIN/SEDOL), maturity, coupon, dates, class, etc. and to which portfolio this order is for. The trader enter the order on the Order Management System (OMS), aka Investment Management System (IMS) and the order status is Draft.
  • When the order is created, various compliance checks automatically run on the OMS for that portfolio, and if this order breaches one of the rules (be it mandate or regulatory), OMS will block this order and it cannot continue to the next stage.
  • If the order passes the compliance check, the manager of that fund then get notified automatically, asking them to approve this order (known formally as “authorise”). The fund manager checks the impact of this order to her/his portfolio e.g. the valuations, the risks, and the portfolio characteristics such as duration, and then authorise the order.
  • In the mean time the trader tries to find the best price by comparing quotes from several brokers. If it is an OTC deal, the risk system automatically checks the counterparty risk level of the broker. If the risk level is breached, this order is blocked and can’t continue.
  • Once the order is authorised, and the risk + compliance systems are both green, the trader go ahead with one of the brokers (or two) and place an order with them. At this point the status of the order becomes “sent”. There is a new MiFID regulation with regards to best execution, i.e. an asset management company must execute every order to the best of the client’s interest, not the company’s interest. This means finding the best price amongst the brokers and this is reported to the regulator.
  • The broker then send the order to the exchange, to find a match on the opposite side, i.e. if it is a sell order they find a buy side, and vice versa. Once a match is found and agreed the order becomes a trade, and the order status becomes “filled”. The exchange sends the trade details to the broker i.e. who the other side is, and at what price the trade happened. The exchange also sends the trade details to the custodian i.e. the bank who holds the securities on behalf of the asset manager (usually also do settlements and administration such as dividend and coupon payments).
    Note that trade means both sides whereas order means one side.
  • The broker informs our trader that the trade has been executed, along with the trade details such as price, who the opposite side is and the settlement date. This is called “confirmation”. The broker confirms all the trade details with the trader. The broker on the other side also confirm the trade details with their client. Any incorrect details would be corrected at this stage.
  • Two days after the trade was made, we settle the trade, known as T+2 (some securities are T+1 or T+3). If we are buying, our back office will transfer the money to the clearing house, and the clearing house sends the security to our custodian. If we are selling, our custodian will send the security to the clearing house and our back office will receive the payment. Often the custodian bank offers settlement services, which means that, instead of the back office making and receiving the payments, it is the custodian bank who is making and receiving payments to and from the clearing house. After this stage, the status of the order becomes “settled”.
  • Then the securities is allocated into the relevant portfolios (if we are buying, and the trader combines several internal orders into 1 external order). At this point the status of the order is “allocated”. But, the new security was put in the portfolio way before “allocation”. The security is put as “holding” in the portfolio (and therefore affects the value of the portfolio), when the order status is “filled”.
  • After the order is allocated, the order is closed. The order status at that point becomes “closed”.
  • Before an order is executed (order status = “filled”), it can be cancelled and the order status becomes “cancelled”. After the order is filled, the order can’t be cancelled as both sides have committed*. If the broker made a mistake, e.g. the buying the wrong security, then the broker pay for the wrong security out of their pocket (which they will sell later) and then buy the correct security for us the AMC.
    *Note: The exchange usually allows cancellation, as long as both sides agree. But the opposite side of the trade rarely agree to the request to cancel the trade. Correction, yes (at confirmation stage) but cancelling, usually no.

Ok that’s a lot of background. Now let’s specify the design of the fact table.

Transaction Fact Table

  • Description: stores every order for every portfolio. In the transaction fact table we do not store the other side of the trade. We only store our side of the trade. We store the quantity, price, value and the dates of every stage.
  • Name: Fact Transaction
  • Grain: 1 row for every order
  • Fact Table Type: accumulated snapshot
  • Surrogate Keys: Transaction Key (aka Order Key), Order Status Key, Broker Key, Trader Key, Portfolio Key, Security Key, Exchange Key, Authoriser Key, Order Created Date Key, Authorised Date Key, Sent Date Key, Filled Date Key, Confirmation Date Key, Settlement Date Key, Cancellation Date Key, Closed Date Key.
  • Degenerate Dimension: Order ID (internal), Order Type (B for Buy and S for Sell), Trade ID (external)
  • Measures: Quantity, Trade Price, Order Value, Traded Yield, Traded Spread
  • Loading: once a day from the “trade file” exported by the OMS at the end of the day (about 11pm London time, after the US market is closed). Just like position file. This trade file is loaded to the data warehouse straight after it is produced by the OMS. All orders which are not “closed” are “loaded” into this fact table: new orders are inserted, existing orders are updated.

Note: Bear in mind that 2 internal orders (2 order IDs) can be combined into 1 external trade (1 trade ID). And 1 internal order (1 order ID) can be split into 2 external trades (2 trade IDs), each going to different brokers, each has different opposite side. The grain of this fact table is 1 row per internal order in the OMS. If 2 orders are combined into 1 trade, 2 rows will have the same trade ID. If an order is split into 2 trades, only the trade ID of the primary trade is stored here on that row (the largest trade, by value).

Unlike in the retail industry (supermarkets, shops) where an order usually contain multiple items, in OMS an order contains one security.

External Trade Fact Table

Now that we know what a transaction fact table is, we can create an external trade fact table which stores externally facing trades.

  • Name: Fact External Trade
  • Grain: 1 row for each trade
  • Fact Table Type: Accumulated Snapshot
  • Surrogate Keys: Trade Key, Trade Date Key, Exchange Key, Broker Key, Trade Type Key, Security Key, Counterparty Key, Confirmation Date Key, Settlement Date Key.
  • Degenerate Dimension: Trade ID
  • Measures: Quantity, Trade Price, Value
  • Loading: once a day from the OMS export

C. Dimensions

C.1. Security Dimension

A security is a financial instrument, such as shares/stocks of a company. The followings are also securities:

  • Bond: an instrument where we get paid coupon (interest) every 6 months
  • Option: an agreement to buy or sell shares or bonds (or currencies, or indexes)
  • Future: an agreement to buy or sell bonds or commodities in the future
  • Forward: like Future but for currencies or commodities
  • Swap: an agreement to exchange income from a security
  • Foreign Exchange (FX): an agreement to buy a currency
  • Credit Default Swap (CDS): an agreement to get compensation if a bond is defaulted (the issuer can’t pay the coupon/interest)
  • Loan: an agreement to borrow money with interest
  • Securitised: a bond backed by a collection of loans or mortgages
  • Certificate of Deposit: a certificate from a bank which indicates that we have deposited a large sum of money in that bank

Note: the above is called “asset class”.

The followings are not a security:

  • Cash, i.e. money in different currencies (but an option to buy money in other currency is a security)
  • Account, i.e. we put money into a broker or a bank (e.g. for margin trading)
  • Real estate, i.e. property such as a house or a land
  • Commodity, such as oil, cotton, iron and copper (but the a future on commodity is a security)

In the broadest sense securities can be grouped into two: bond or equity. Bond is formally known as “debt”, to include other fixed income securities outside bonds, such as derivatives (options, swaps and forward).

A security can be listed in an exchange. For example: a share can be listed in a stock exchange, a future can be listed in a future exchange. A security can be negotiated just directly with an investment bank (they are called the “sell side”, we are called the “buy side”) and this type is called OTC, stands for over-the-counter, meaning that we make a deal outside the exchange. OTCs are mostly swaps and options, but an equity and future can also be bought outside the exchange (directly from the seller, which is mostly banks). This attribute is called security type, i.e. listed or OTC. Listed security is also called ETS, stands for Exchange Traded Security, or ETD, stands for Exchange Traded Derivatives (if it is a derivative).

Security Master application

An asset/investment management company usually has an instrument mastering system, aka Security Master application. This application enables the creation of a security (financial instrument*) in real time whenever the following event occurs (or before hand):

  1. A benchmark contains a new security (not already exists in security master).
  2. An order contains a new security.
  3. An OTC deal/trade is made.
  4. A file from ESG vendors contains a new security

*Note: cash is a financial instrument but not a security (ditto deposits).
Asset is everything that is held in a portfolio, including cash, deposits and derivatives.

Security Master

The Security Master application stores every security used in the company. Equity, bonds, options, futures, swaps, loans, certificate of deposit, treasury, money market, ETF, everything. Old securities which have matured are also stored here. Every security held in every portfolio is stored here. Every security held in any benchmark used by the company is stored here. Every security resulting from any over-the-counter trade is stored here (e.g. IRS, CDS, FX option). Every ISIN or Sedol mentioned in any ESG file from any ESG vendors (e.g. Bloomberg, Trucost, Sustainalytics, MSCI, ISS, Factset, BoardEx, CDP). The ESG data is actually per company, not per security. But, next to the company name, the vendor put the primary ISIN or Sedol of that company (usually the common equity).

The primary source of a Security Master application is the Bloomberg security universe. There are about 650,000 securities there, but an investment/asset management company usually only request the actively traded ones, which is about 67,000. This way, when fund managers are considering to buy a new equity or bond which is not in the index, they will find it in the OMS. Because the OMS takes the security from the Security Master and the Security Master has every actively traded security, be it shares, bonds, treasury, listed options or listed futures.

Of course the Bloomberg security universe does not contain OTC derivatives*, because an OTC derivative is a security created by two companies making a deal between them to buy and sell an option or a swap. They are not listed in any exchange. Therefore the OTC system needs to connect to the Security Master.
*Note: are there OTCs which are not derivatives? Yes, an equity can be bought outside an exchange.

The main purpose of a Security Master app is to enrich the “skeleton” security with various attributes (usually from Bloomberg) such as security name, country of risk, country of incorporation, country of domincile, currency, rating, issuer, exchange, and industry sectors, along with various identifier such as ISIN, Sedol, Cusip and Ticker. Security Master also have screens/pages where user can manually alter the attributes of a security.

Security Dimension

The ideal source of a security is the security master application where the creation, integration and enrichment happens, but if it is not available (usually in the case of a small asset manager), then security dimension becomes the creation and integration point. Creation point means that new securities such as OTC are created in the security dimension (skeleton only) and then enriched by making a “per sec” call to Bloomberg to get the remaining attributes. I have to emphasise that a security dimension is not the right place for mastering security in the company, because a) the mastering process requires a lot of manual checking, and potentially manual entries, b) it requires judgements with regards to what value we want to allocate to security, when there are conflicting values coming from different data sources.

Dimension Design:

  • Dimension Name: Dim Security
  • Dimension Type: SCD Type 2
  • Grain: 1 row for each version of a security
  • Business Key: Security ID (from Security Master)
  • Surrogate Key: Security Key
  • Description: contains listed and OTC securities but not cash, funds or deposits. This is a “core and custom” dimensions, i.e. Dim Security as the core dimension contains about 40-50 common attributes across asset class and Dim Security Equity/Fixed Income/Swap/Option/CDS as custom dimensions covering attributes specific to those asset classes.
    Note: The “core and custom” dimensions is also known as “supertype and subtype” dimension, see Jeremy Kashel’s explanation here on Adatis blog and a short description here on Kimball group website. Also see the Learn Data Modeling post on this here and Oracle Academy presentation here.
  • Number of rows: about 500,000 to 1m (30-40,000 distinct securities, 10-20 version each)
  • Number of columns: about 40-50
  • Attributes: Security ID (from Security Master), Security Name, Description, ISIN, Sedol, Cusip, BBG ID (Bloomberg), Ticker (Bloomberg), FSYM ID (Factset), RED ID (Markit), RIC (Thomson Reuters), Master Issue ID (Moody’s), Exchange, Issuer, Currency, Country (Risk, Domicile, Incorporation), Market, Industry (GICS, ICB, ML, BBG, Barclays), Issue Date, Security Type, Maturity Date, Underlying Issuer, Asset Type/Class/Group.
  • Security attributes not included: Series, Tenor, Reference Name (these are CDS attributes), callable, call dates, puttable, put dates (these are Option attributes), Nominal, Principal, Coupon, Debt Type, Convertible Flag, Perpetual Flag, Securitisation, Zero Coupon, Sinking Fund, Credit Ratings, Reference Yield Curve (these are fixed income attributes). These attributes are stored in Dim Security Fixed Income, Dim Security Equity, Dim Security Swap, Dim Security Option and Dim Security CDS
  • System columns: inserted datetime, last updated datetime, latest flag.
  • Loading: from security master. Rows in the security master is already deduplicated, meaning that it’s 1 row per security. So we can take an export of the security master, and apply the SCD type 2 process on the attribute that we like to keep the historical value.


  • Dimension Name: should it be called instrument dimension or security dimension? Either one is fine. Financial instrument includes cash and deposit, whereas security doesn’t. Is oil a financial instrument? No. Is it a security? No. Commodities like cotton, wheat, soya beans, iron ore, copper and oil are not financial instruments or securities. But the options on them are financial instruments and security.
    Gold is not a financial instrument. A loan is a financial instrument, because it is tradable. An FX option is a financial instrument because it is a trade (a deal) between 2 parties. Property is not a financial instrument.
    The definitive definition of a financial instrument is given by the Securities Act (link) and the IMF: link.
  • The same CDS is issued by several banks, for example a 5 year CDS on a Tesco bond can be issued by JP Morgan and Citi Group. Tesco is the reference entity, whereas JPM and Citi are the issuer. So the “issuer” of a CDS is different from the issuer of a bond or shares.
  • Licence on identifiers: to use the proprietary security identifier listed above (FSYM, BBG ID, RIC, etc) we need to pay a licence fee. Ditto Cusip (link). We can’t store them without paying first. But not ISIN and Sedol, as far as I know.
  • Ticker: Each vendor has their own ticker, e.g. Bloomberg tickers are different from Factset tickers, and are different from Reuters tickers. They are all licenced (not free). For example: The ticker for PT Telekomunikasi Indonesia are TKLM:IJ (Bloomberg) and TLKM:JK (Reuters). The Exchange Ticker code (known as EPIC or TIDM in UK) is free to use as far as I know. For example: NYSE: T, LSE: LLOY, ETR: BMW, NASDAQ: AMZN and IDX: TKLM.
  • The shares of a company can be traded on different exchanges under different ISINs. For example, RBS is traded in London (ISIN: GB00B7T77214) and New York (ISIN: US7800976893, as ADR).
  • OTC swaps such as IRS are stored in the security dimension as one row, not 2 rows. Most of the attributes are stored in Dim_Security_Swaps. Only the core attributes are stored in Dim_Security. In Dim_Security_Swaps IRSes are stored as 2 rows: 1 for the fixed leg and 1 for the floating leg. It is possible that both floating, in the case of multi-currencies swap.

C.2. Issuer Dimension

The issuer dimension stores four things:

  • All public companies which issue shares or bonds.
  • All central governments, municipal governments and government related bodies/agencies which issue bonds.
  • All the investment banks which issue credit default swaps (CDS) and certificate of deposit (CD).
  • All companies which have ESG data (both public and private)

Dimension Design:

  • Dimension Name: Dim Issuer
  • Dimension Type: type 2
  • Grain: 1 row for each version of company, government, or agency
  • Business Key: Bloomberg Company ID, or Factset Entity ID, or Markit RED ID, or LEI, or DUNS ID.
  • Surrogate Key: Issuer Key
  • Attributes: Issuer Code (one of the business keys above), issuer name (legal name) – this is type 1, issuer short name, several entity identifiers (BBCID, RED, FSYM, LEI, DUNS) – this is type 1, several primary security identifiers (primary ISIN, primary SEDOL, primary CUSIP, primary ticker, primary exchange) – this is type 2, issuer type (public company, private company, central government, municipal goverment, or government agency) – this is type 2, headquarter address, telephone, contact person, country of risk (primary country of operation), country of domicile, country of incorporation (all countries are stored as 2 character ISO codes, not as surrogate keys), parent company ID (this is the business key e.g. BBG CID, not the surrogate key), industry sector (several columns, i.e. GICS, ICB, SIC, NAICS, ISIC, ML, BBG, Barclays).
  • Attributes not included: CEO (this is stored in the Executive Dimension because it date dependent), market cap (this is stored in a fact table because it is date dependent), number of employees (date dependent – should be stored in a fact table)
  • System columns: inserted datetime, last updated datetime, latest flag.
  • Loading: a daily export of the issuer table from the security master get loaded into Dim Issuer, with SCD type 2 process.


  • Name: we should not call this dimension Dim Company because the governments are not companies.
    Dim Entity or Dim Legal Entity is possible but it is not popular nor clear.
  • Other entity identifiers are listed on Alacra (link). Most of them are either sectoral (e.g. only banks) or geographical (e.g. only Japan).
  • The main challenge in populating this dimension is to make sure that each company is only inserted once. This matching process (to identify if a company already exists) thankfully does not happen here in the issuer dimension, but in the security master system.
    The security master system, after creating a new security, it populates the issuer code and underlying issuer code for that new security. At this time, it tries to find that issuer in its internal issuer table, using the natural key (e.g. RED, FSYM or BBG CID). Because issuers are already deduplicated in the security master, we in the DW team just need to consume it.
    But the ESG data sources will have companies not in the “investment” data space. They need to be added into the security master, and come out into the DW.

C.3. Client Dimension

A client is from whom we receive a mandate and money to invest on their behalf. It could be a segregated mandate, which means that we manage their money separately, i.e. in a separate portfolio. It could be a pooled fund, which means that we put their money into one of our existing fund, and manage it together with the money from other clients. So in this sense, our clients are companies or government bodies.

The client dimension is usualy built from Salesforce which stores the detail of every single client and prospect. Most of the asset managers that I know use Salesforce for their CRM (Salesforce is used by many other industry too, e.g. banking and insurance).

Many asset managers are institutional, e.g. they only have corporate clients. In this case the client dimension will contain only companies and governments/goverment bodies. Some asset managers are retail, e.g. they have individual investors as well as corporate clients. In this case the client dimension will contain companies, government/goverment bodies and individual people. Companies have different attributes to people. Companies have different attributes to governments too, but not as different as people’s attributes, i.e. companies and governments and government bodies/agencies can be put into one table.

The question is, should we create two client dimensions, one for company/government and one for individual people? Yes we should. The retail client have different requirements, different attributes and different processes to the corporate clients. And they access different funds too.

Within corporate clients, we have a few major categories: insurance clients, pension funds, consultants, corporate investors, and government/sovereign clients.

Insurance clients usually request the following Solvency II related reports: Pillar 3 QRT (Quantitative Reporting Templates), Tripartite template, capital charge calculation with risk pillar attribution, and capital charge analysis.

Whereas pension fund clients usually only needs the following (insurance clients need these too): asset valuation (holding level and higher level aggregations, e.g. sector, rating, duration), performance attribution, performance return. All of the above are comparing the portfolio to the benchmark. Plus they also need market review, market outlook and trades report.

Dimension Design:

  • Dimension Name: Dim Client
  • Dimension Type: type 2
  • Grain: 1 row for each version of a client
  • Business Key: Salesforce Account Number
  • Attributes: Client Name, Address (including post code and country), Billing Address (if different), Telephone, Website, Main Contact Person, Main Email Address, Account Manager, Industry, Client Type, Client Sector, Account Source, Parent Company ID, Date became client, Date of first mandate.
  • System columns: the usual 3.
  • Data source: from Salesforce.

If we are administering our own retail funds, we need to create a subscriber dimension, which contains the individual investors. We need to install and use a fund administrator software, and the subscribe dimension is populated from the export of the fund administor system. If the fund administration is outsourced to a third party, we get our subscriber data from the third party.

Dimension Design:

  • Dimension name: Dim Subscriber (or Dim Investor)
  • Dimension Type: type 2
  • Grain: 1 row for each version of a subscriber
  • Business Key: Subscriber
  • Surrogate Key: Subscriber Key
  • Attributes: Subscriber name, Address, Date of birth, Email address, Telephone number, Date of first subscription.
  • Data source: fund administration system

It is rare that we need to create this Dim Subscriber because we are not dealing with the day-to-day subscription (people putting money in) or redemption (people pulling their money out), or income distribution (allocating the dividends from the company in the fund to the subscribers). The day to day task of administering a fund is usually outsourced. If we do manage the day to day fund administration, then we also need to create a fact table for subscriptions and redemptions, and any other movement of money within the funds.

C.4. Portfolio Dimension

In layman term, a portfolio is a collection of shares or bonds that we manage as one entity. In addition shares and bonds, we can also have derivatives such as options, futures, swaps, FX, CDS, securitised, CD, etc (see section C1 above where I explained about various securities), and cash.

A typical equity portfolio consists of 30 to 60 shares occupying 98% of the fund, with the remaining 2% as cash. It is open to the public, meaning that individual investor can in that fund. The fund could be available in different share classes, i.e.

  • In different currencies such as USD, EUR and GBP
  • With different distribution type, i.e. accumulation or income. Accumulation means the dividend from the shares is reinvested into the fund. Income means the dividend from the shares is distributed to investors.
  • For different target invetors, i.e. institutional or retail investor. One share class might have minimum investment of £100 because it is for retail investor, whereas another share class might have minimum investment of £100,000 because it is for institutional investor.
  • With different charges, i.e. one share class having Initial Charge of 5% and Ongoing Charges Figure (OCF) of 1%, whereas another share class has Initial Charge of 4% and OCF of 1.5%. OCF includes expenses that we pay for administrator, accountant, auditor, legal, depositary and registrar.
    A depositary bank (with an o in the US, depository) holds the shares in the fund and distribute the dividends from the shares. It is also known as custodian bank or fiduciary.
    A registrar keeps a register of shareholders of the fund (including AML and KYC) and produce capital account report and distributor position report. For a fund the correct term is “subscriber”, for a unit trust it is “shareholder”. AML stands for Anti Money Laundering, to check that the money from the subscriber is from valid source. KYC stands for Know Your Customer, to check the identity of the subscriber to make sure they are who they say they are, and investigate if they are polically connected or domiciled in restricted countries.
  • With different performance fees, i.e. one share class may charge a performance fee, whereas the other share class doesn’t charge a performance fee. For example the investor pays 10% of the amount by which the NAV of the fund exceeds the FTSE Allshare index.
  • ISA compliant, meaning whether individual investors can keep this fund in their Individual Saving Account (ISA) or not.
  • SIPP compliant, meaning whether individual investors can keep this fund in their Self Investing Pension Plan account (SIPP) or not.

A fund can be open ended, meaning that when there is a new subscriber they issue new shares (or unit of shares, to be precise) and when a subscriber sells their holding their shares are sold (called “redemption”). Or it can be a closed ended fund, meaning that the number of shares is fixed, and the fund is listed in the stock exhange so investors can buy and sell it.

A fund has specific objective, which is to invest in certain markets, in certain financial instrument. For example, only equity or only bonds (or both). Only in Europe or only in Asia (or global). Only small companies or only big companies (or both). Only emerging market or only developed market (or both). Only in bio technology companies only in financial services companies, or only in mining companies (or all industry sectors).

A fund has a benchmark, for example to outperform FTSE 100 GBP Total Return Index. Total Return means that the dividend is reinvested back into the index. FTSE 100 is the 100 largest companies listed in London stock exchange.

So that is one end of the spectrum: a portfolio which is a fund. The other end of the spectrum is a portfolio which is a segregated mandate. We receive a mandate (instruction) from the pension fund trustee of Supermarket ABC to invest £20 million of their money in a certain way, for example: 80% of it must be invested as fixed income in the US and European markets separately, we must not take any short position and we must not use derivative other than FX options to hedge currency risk. We create a client account called Supermarket ABC, and under this account we create a portfolio called “Supermarket ABC US Fixed Income” and “Supermarket ABC European Fixed Income”.

Now that we know the difference between a portfolio, a fund, a mandate and an account, we are ready to design the portfolio dimension. But one more thing: some asset manager called a portfolio as a “product”, because that’s what it is known as in the sales and marketing department. In my opinion this is incorrect because a product can consists of several portfolios.


  • Dimension Name: Dim Portfolio (Dim Account, Dim Fund, Dim Product are not correct names, due to the reasons I described above)
  • Dimension Type: type 2, particularly on portfolio manager, status, valuation point, distribution date, depository because we need to record the previous values. Not on inception date, portfolio name, description, etc – they are type 1 attributes because the changes means correction.
  • Grain: 1 row for each version of the portfolio.
  • Business Key: Portfolio Code
  • Surrogate Key: Portfolio Key
  • Attributes: Portfolio Name, Portfolio Description, ISIN, Client Code (not Client Key that’s snowflaking), Legal Structure (ICVC, OEIC, Unit Trust, etc.), Portfolio Type (Fund, Sub Fund, Segregated or Shell/Composite) Main Asset Class (Equity, Debt or both), Distribution Type (Accumulation or Income), Parent Portfolio Code, Main Share Class, Base Currency, Portfolio Manager (or Fund Manager), Status (Active, Soft Close or Closed), Inception Date, Closed Date, Distribution Date, Depositary Bank, Valuation Point,
  • Attributes which should not be included: number of holdings, fund size, sharpe, alpha, beta, yield because they are date dependent so they should be stored in fact tables.


C.5. Benchmark Dimension

C.6. Country Dimension

C.7. Currency Dimension

C.8. Rating Dimension

C.9. Industry Dimension

C.10. Asset Class Dimension

C.11. Performance related dimensions

C.12. Date Dimension

C.13. Broker Dimension


Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Create a free website or blog at

%d bloggers like this: