Search
Latest Links
Twitter

Entries in sap (4)

Monday
Nov212011

SAP BI OnDemand and Hana

It's been some time now since the press releases and SAP TechEd Bangalore keynote proclaiming that SAP's BI OnDemand product now runs on HANA as its underlying database. The press releases have gone out. The product is here. The BI OnDemand website has been updated with a shiny new "Powered by SAP Hana" logo.

There is only one problem. It seems that the BI OnDemand that most people can see is not actually powered by Hana.

I discovered this for myself when discussing the topic with Courtney Bjorlin, who was working on an article about the announcement. SAP confirms in the article that only the "Advanced Edition" of BI OnDemand is available on the HANA database. At SAP's TechEd in Madrid, I was able to ask around on the show floor and hallways and find out more about the situation.

How do I get BI OnDemand running on HANA?

You have to buy the "Advanced Edition" of BI OnDemand. This involves a sales process and is a hosted version of the BI OnDemand platform. It seems that it's not exactly SaaS or "OnDemand", but more on that below.

The fact that the logo at https://bi.ondemand.com says "Powered by SAP Hana" is apparently an inaccuracy. Hopefully that will be corrected soon.

What are these different "editions" of BI OnDemand?

There are three "editions" of BI OnDemand: Personal, Essential, and Advanced. Based on my discussions, it seems that Personal and Essential editions are SaaS applications hosted by SAP, while the Advanced edition is hosted by partners. All editions seem to include the same web interface as seen on bi.ondemand.com, but the Essential version includes customization and branding options as well as more storage. The Advanced edition features even more storage and customization options, plus access to a hosted version of the BusinessObjects Data Services, which can be used to manage contents of DataSets. This integration of Data Services can allow for incremental updates to DataSets, which is a key feature and is not possible in the Personal or Essential editions.

As far as I can tell, none of this is documented anywhere on SAP's standard sites. My thanks to Richard Hirsch for finding this presentation outlining some of these points (see page 17): link to PDF.

So if I have the Advanced edition, I'm now on Hana?

No, not quite.

First of all, based on discussions at TechEd Madrid, it seems that only new customers can currently get onto the Hana-based BI OnDemand platform. Apparently there are contingencies for existing customers to migrate eventually, but right now it is only for new customers.

Further complicating the issue, it seems that not all hosting partners for the Advanced edition provide HANA as the underlying platform. I was told by SAP employees on the show floor that only one partner is currently providing BI OnDemand on HANA, and that partner is only in North America. Other partners are providing the BI OnDemand on the older Microsoft SQL Server-based platform. I have yet to confirm this; it is only based on the one source, so take it with a grain of salt. But there is clearly confusion around the availability of BI OnDemand using HANA, even if you are purchasing the Advanced Edition.

If capabilities provided only by HANA are required for your implementation, be sure you are actually getting HANA when you buy the BI OnDemand Advanced Edition.

Is it Hana or HANA?

I have no idea. I did learn at TechEd that HANA (or Hana) is not an acronym, so I'm leaning towards Hana, but old habits die hard.

Ok, enough with the Q&A. What does this mean?

In my view, this means that SAP still has a lot of work to do getting its message across clearly. It is not particularly bad or good that HANA is not available for the Personal or Essential editions of BI OnDemand. These editions are limited to data set sizes that are simply too small for HANA to make much of a difference.

The greater concern here is one of communication. For any company, it is extremely important to say what you mean and mean what you say. It would have been much better if SAP had been clear about the roll-out of HANA for BI OnDemand from the beginning. As it stands now, many people will try out the Personal edition and think that they are using "Hana", but they're not.

Looking to the wider view, I worry about what this partial roll-out means for SAP's BI cloud play. The BI SaaS market is still very immature and SAP has the opportunity to play a leading role in this emerging market. However the BI OnDemand product doesn't seem to have received the sort of development attention required for this role, and the deployment options seem to be severely lacking.

Companies and departments looking to buy powerful SaaS BI capabilities are not interested in figuring out what database the product is using and the impact this has on their reporting needs. SaaS should work as defined in SLAs, and it should keep getting faster and better in a way that is non-disruptive.

After talking with some of the BI OnDemand development team in February, I know that they have a good understanding of the BI SaaS space and have some great ideas for the BI OnDemand platform. I'd love to see SAP deliver on its potential in this area and I think they have the people and the vision to do so, but we haven't seen it in the product yet.

Hopefully SAP can get both the BI OnDemand message and the platform straightened out quickly. The BI SaaS market it still extremely young and SAP could be leading the way.

Disclosure: SAP provided my travel and badge for the TechEd + Sapphire 2011 conference in Madrid.

Thursday
Sep152011

On Layers

Something I saw somewhere about cutting layers out from a software stack reminded me of something else that has been niggling me for a while. Most likely what I saw was someone talking about the Vishal Sikka keynote at SAP's TechEd conference, talking about SAP's HANA in-memory application server removing layers from the software stack.

SAP has been doing this layer-collapsing move a fair amount lately and I think this is attributable to their innovation push over the last few years. I think that's all good, though we'll see where it brings us. More on that later. But what niggles me is this idea that removing layers is always a good thing. Them there layers are actually doing something you know!

So what are layers? On the server side of a client-server (two layers right there) architecture we might be talking about database servers, application servers, and front-end rendering servers. In application the application space maybe we talk about model, view, controller architectures. Every system of any complexity has some sort of layering concept.

Why do these layers exist? What are they doing? Primarily lowering costs and allowing people to get their work done. Layers server as an abstraction, as a way to get work done within a layer without having to worry too much about what is going on outside of the layer. They are a protective bubble that allows me to do things like write this post or tweet without experiencing (increasingly) existential angst about how the bits are stored.

Yes, layers look really complex when viewed from a systems perspective, but that is usually because they are codifying an incredibly complex system into a form that we can actually comprehend. Layers force structure onto a system in order to make spaces where complexity is hidden and people can live and work.

Which brings us to the point that it's not really about what is in the layers so much as it is about the transitions between the layers - those shields that create the spaces where actual work and life gets done. Transitions between layers are interfaces of some sort. They are negotiated ways for layers to talk to each other. In programs, these are called APIs. In real life, this is called bureacracy.

APIs and bureacracy share a few things in common:

  • They have lots of rules about who you have to talk to and how you have to talk to them
  • Everyone thinks they are terribly designed
  • They're usually pretty slow compared to going "stright to the source"
  • If they didn't exist and everyone went "straight to the source" then there would be total chaos

In short, everyone hates them but nothing would get done without them.

Sometimes, layers outlive their usefulness and those who hate them overwhelm the inherent inertia of mass acceptance. When this happens in technology, it's a "technical revolution". When it happens to companies, it's a "reorganization". And when it happens to governments, it's a real "revolution.

Here's the thing: Revolutions are pretty nasty affairs.

In the political sphere, lots of people die. In the industrial sphere, people lose their jobs. And in technology people experience angst about bits.

Seriously, there have been a ton of questions about how HANA persists data. People are literally experiencing angst over how their bits are stored.

So, I know that tearing apart layers is sometimes necessary in order to get past some institutionalized road-blocks that have outlived their usefulness. Just remember that those layers you just ripped through actually were doing something useful.

Sunday
May222011

SAP's HANA and "the Overall Confusion"

I threw together a very long response to a very long question on the SCN forums, regarding SAP's HANA application and its impact on business intelligence and datawarehousing activities. The original thread is here and I'm sure it will continue to grow. But since my response was pretty thorough and contains a ton of relevant links, I thought I would reformat it and post it here as well. In order to get a good overview of the HANA situation, I strongly recommend that anyone interested check out the following blogs and articles by several people, myself included:

Some of these blogs are using out of date terminology, which is hard to avoid since SAP seems to change its product names every 6 months. But hopefully if you read them they will give you some insight into the overall situation unfolding around HANA. With regards to DW/BI and HANA, these blogs address many of those issues as well. Now, to try answering the questions:

1. Does SAP HANA replace BI?

It's worth noting that HANA is actually a bundle of a few technologies on a specific hardware platform. It includes ETL (Sybase Replication Server and BusinessObject Data Services), Database and database-level modeling tools (ICE, or whatever it's called today), and reporting interfaces (SQL, MDX, and possibly bundled BusinessObjects BI reporting tools). So, in the sense that your question is "does anything change as far as needing to do ETL, modeling, and reporting work to develop BI solutions?", then the answer is no. If you are asking about SAP's overall strategy regarding BW, then this is open to change and I think the blogs above will give you some answers. The short answer is that I see SAP supporting both the scenario of using BW as a DW toolkit (running on top of BWA or HANA) as well as the scenario of using loosely coupled tools (HANA alone, or the database of your choice with BusinessObjects tools) for the foreseeable future. At least I hope this is the case, as I think it would be a mistake to do otherwise.

2. Will SAP continue 5-10 years down the road to support "Traditional BI"?

I hope so. If you read my last blog listed above you will see that HANA actually solves none of the traditional BI problems, and addresses only a few of them. So we still need "traditional" (read "good old hard work") approaches to address these problems.

3. What does this mean for our RDBMS, meaning Oracle?

Very interesting question. For a long time, SAP has supported competitive products to Oracle offerings. In my view, this was to give SAP and its customers options other than the major database vendors, and to give itself an out in the event that contract negotiations with a major vendor went south. So in a sense, HANA can be seen as maintaining this alternative offering. Of course, SAP says HANA is more than that, and I think they are right. Analytic DBMSes have been relatively slow catching on and as SAP's business slants more and more towards BI, the fact is that the continued use of traditional RDBMSes in BI and DW contexts has done a lot of damage by making it difficult to achieve good performance. It's a lot easier to sell fast reports than slow reports :-) So that is another driver. Personally, I don't agree with SAP's rhetoric about HANA being revolutionary or changing the industry. The technologies and approaches used in the ICE are not new, as far as I have seen. As far as changing the industry from a performance or TCO perspective, I'm reserving judgement on that until SAP releases some repeatable benchmarks against competing products. I doubt that HANA will significantly outperform competitive columnar in-memory databases like Exasol and ParAccel. If you are Oracle, you have a rejuvenated, and perhaps slightly more frightening competitor. I don't think anyone really thought that MaxDB was a danger to Oracle, but HANA holds more potential as a competitor to Exadata. Licensing discussions could get interesting.

4. Is HANA going to be adopted and implemented more quickly on the ECC side than BI side first?

Everything I have seen has indicated that SAP will be driving adoption in BI/Analytic scenarios first and then in the ECC/Business Suite scenario once everyone is satisfied with the stability of the solution. Keep in mind, the first version of HANA is still in ramp-up. SAP is usually very conservative in certifying databases to run Business Suite applications.

Wednesday
Oct272010

What does SAP mean by "In-memory"?

It's been a bit more than 2 years since SAP introduced the "In Memory" marketing push, starting with Hasso Plattner's speech at Sapphire ... or was it TechEd ... my memory fails me ;-)

It has been two years and I have yet to see a good understanding emerge in the SAP community about what SAP actually means when it talks about "In Memory". I put the phrase "In Memory" into quotes, because I want to emphasize that it has a meaning entirely different from the standard English meaning of the two words "in" and "memory". This is a classic case, best summed up by a quote from one of the favorite movies of my childhood:

Vizzini: HE DIDN'T FALL? INCONCEIVABLE. Inigo Montoya: You keep using that word. I do not think it means what you think it means.

- IMDB

The only reasonably specific explanation of the "In Memory" term that I have seen from SAP is in this presentation by Thomas Zurek - on page 11.

If you want a coherent, official stance from SAP on "In Memory" and the impact of HANA on BW, I highly recommend reading and understanding this presentation. I think I can add a little more detail and ask some important questions, so here is my take:

Fact (I think...)

SAP is talking about at least 4 separate but complementary technologies when it says "In Memory":

1. Cache data in RAM

This is the easy one, and is what most people assume the phrase means. But as we will see below, this is only part of the story.

By itself, caching data in RAM is no big deal. Yes, with cheaper RAM and 64-bit servers, we can cache more data in RAM than ever before, but this doesn't give us persistence, nor does working on data in RAM guarantee a large speedup in processing for all data-structures. Often, more RAM is a very expensive way to achieve a very small performance gain.

2. Column-based storage 

Columnar storage has been around for a long time, but it was introduced to the SAP eco-system in the BWA (formerly BIA, now BAE under HANA - gotta respect the acronyms) product under the guise of "In Memory" technology. The introduction of a column-based data model for use in analytic applications was probably the single biggest performance win for BWA and followed in the footsteps of pioneering analytical databases like Sybase IQ, but it was largely ignored.

Interestingly, Sybase IQ is a disk-based database, and yet displays many of the same performance characteristics for analytical queries that BWA boasts. Further evidence that not all of BWA's magic is enabled by storing data in RAM.

3. Compression

So how do we fit all of that data in to RAM? Well, in the case of BWA the answer is that we don't - it stores a lot of data on disk and then caches as much as possible in RAM. But we can fit a lot more data into RAM if it is compressed. BWA, and HANA, implement compression algorithms to shrink data volume by up to 90% (or so we are told).

Compression and columnar storage go hand-in-hand for two reasons:

a. Column-based storage usually sorts columns by value, usually at the byte-code level. This results in similar values being close to each other, which happens to be a data layout that results in highly efficient compression using standard compression algorithms that make use of similarities in adjacent data. Wikipedia has the scoop here: http://en.wikipedia.org/wiki/Column-oriented_DBMS#Compression 

b. When queries are executed on a column-oriented store it is often possible to execute the query directly on the *compressed* data. That's right - for some types of queries on columnar-databases you don't need to decompress the data in order to retrieve the correct records. This is because knowledge of the compression scheme can be built into the query engine, so query values can be converted into their compressed equivalents. If you choose a compression scheme that maintains ordering of your keys (like Run Length Encoding), you can even do range queries on compressed data. This paper is a good discussion of some of the advantages of executing queries on compressed data: http://db.csail.mit.edu/projects/cstore/abadisigmod06.pdf

4. Move processing to the data

Lastly, the BWA and HANA systems make heavy use of the technique of moving processing closer to the data, rather than moving data to the processing. In essence, the idea is that it is very costly to move large volumes of data across a network from a database server to an application server. Instead, it is often more efficient to have the database server execute as much processing as possible and then send a smaller result set back to the application server for further processing. This processing trade-off has been known for a long time, but the move-processing-to-the-data approach was popularized relatively recently as a core principle of the Map-Reduce algorithm pioneered by Google: http://labs.google.com/papers/mapreduce.html 

This approach is especially useful when an analytical database server (which tends to have high data volumes) implements columnar-storage and parallelization with compression and heavy RAM-caching, so that it is capable of executing processing without becoming a bottle-neck.

Speculation

There are also a few technologies that I suspect SAP has rolled into HANA, but since they don't share the detailed technical architecture of the product, I don't know for sure.

1. Parallel query evaluation 

Parallel query execution (sometimes referred to as MPP, or massively-parallel-processing, which is a more generic term) involves breaking up, or sometimes duplicating, a dataset across more than one hardware node and then implementing a query execution engine that is highly aware of the data layout and is capable of splitting queries up across hardware. Often this results in more processing (because it turns one query into many, with an accompanying duplication of effort) but faster query response times (because each of the smaller sub-queries executes faster and in parallel). MPP is another concept that has been around for a long time but was popularized recently by the Map-Reduce paradigm. Several distributed DBMSes implement parallel query execution, including Vertica, Teradata, and hBase

2. Write-persistence-mechanism 

Since HANA is billed as ANSI SQL-compliant and ACID-compliant, it clearly delivers full write-persistence. What is not clear is what method is used to achieve fast and persistent writes along with a column-based data model. Does it use a write-ahead-log with recovery? Maybe a method involving a log combined with point-in-time snapshots? Some other method? Each approach has different trade-offs with regards to memory consumption and the ability to maintain performance under a sustained onslaught of write operations.

Conclusion

So, there are still a lot of questions about what exactly SAP means (or thinks it means) when it talks about "In Memory", but hopefully this helps to clarify the concept, and maybe prompt some more clarity from SAP about its technology innovations. There is no denying that BWA was and HANA will be a fairly innovative product, but for people using this technology it is important to get past the facade of an innovative black-box and understand the technologies underneath and how the approach applies to the business, data, or technical problem we are trying to solve.