In the dialog following a blog (BE vs any-other-BRE-with-JMS), a number of high-level points were discussed. The TIBCO responses are reproduced below for TIBCommunity viewing...
Original points made by Alan:
- BE takes a co-operative agent approach (for co-operating components and event processing services)
- BE supports high scalability (for parallelizing applications and eXtreme Transaction Processing)
Comparison with other BREs:
- Infrastructure for co-operating agents using other BREs needs to be programmed manually (e.g. a shared blackboard). Advantage BE.
- High performance and scalability are usually offered via the app server used by other BREs, and require additional programming, e.g. for load balancing. Advantage BE (where it's automatic by default).
Response to other BRE capabilities
- Note that most BREs could invoke another rule service (but that is far from sufficient to support an agent-based approach). The point is that the shared distributed memory model (shared Working Memory) is out-of-the-box in BE.
- Other BRE suppliers "both exploring" and "will integrate" implies that this is a future capability for other BREs, whereas it is currently included out of the box with BE. Note that it is quite possible that a conventional BRE customer has done a custom integration with a high performance cache or data grid technology (like you thought of doing). Check out their Wall St customers...
Shared Memory
Shared vs Partitioned (working) memory: actually I believe you need both (and the data grid approach used by BE allows for this). For CEP, where scalability under load (and low latency) is required, controlled shared and non-shared distributed memory is required. The sort of controlled distributed reasoning we tend to need is multiple disparate agents sharing a blackboard model.
BE Shared Memory: just a note to confirm that although BE (>v2) ships with Coherence to handle shared memory, it is architected to potentially support other types of data grid too.
Such data grids are an increasingly interesting alternative to expensive centralized database transactions, especially for CEP and knowledge-based applications.
Collaborative Agents
BE supports collaborative agents (via event passing and/or shared memory / data grid).
For large data sets one would use different agents with different views of the blackboard / shared memory area (i.e. with access to different classes) and apply layered reasoning (different depths of knowledge / rulesets) to the events and event histories. The idea is to try and avoid the problem you tried to solve - i.e. here is a very large set of data to apply to a very large set of rules, all in one go - with a mechanism that allows real-time complex event processing and more knowledge-rich, slower rule processing to collaborate on a solution. "Divide and Conquer" etc etc.
Note also that with CEP, I only want to process an event once, but I might want to process it against all other available data. The same rules are processing changed data (i.e. typically the new event, and any assertions / changes from the other processing agents). So there is no "waste of CPU cycles", and low latency is preserved with a small overhead of synchronizing data updates across WMs (as of course there will always be some overhead)...
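As a rough illustration of the blackboard idea above (this is not BE's API; every class, function and fact name here is hypothetical), the following Python sketch shows two agents with different views of a shared memory. Each raw event is asserted once, and each agent only reacts to changes in the classes it subscribes to:

```python
from collections import defaultdict

class Blackboard:
    """Toy shared memory: facts are stored by class name, and agents are
    notified only about changes to the classes they subscribed to."""

    def __init__(self):
        self.facts = defaultdict(list)        # class name -> list of facts
        self.subscribers = defaultdict(list)  # class name -> list of agents

    def subscribe(self, agent, class_names):
        for name in class_names:
            self.subscribers[name].append(agent)

    def assert_fact(self, class_name, fact, source=None):
        self.facts[class_name].append(fact)
        # Notify every interested agent except the one that made the change.
        for agent in self.subscribers[class_name]:
            if agent is not source:
                agent.on_change(self, class_name, fact)

class Agent:
    def __init__(self, name, rule):
        self.name = name
        self.rule = rule   # callable(agent, blackboard, class_name, fact)
        self.seen = []     # the changes this agent has processed

    def on_change(self, blackboard, class_name, fact):
        self.seen.append((class_name, fact))
        self.rule(self, blackboard, class_name, fact)

def fast_rule(agent, bb, class_name, fact):
    # Shallow, low-latency check applied to each new event.
    if class_name == "Trade" and fact["amount"] > 1000:
        bb.assert_fact("Alert", {"trade": fact["id"]}, source=agent)

def slow_rule(agent, bb, class_name, fact):
    # Deeper, slower reasoning would go here; the sketch only records the change.
    pass

bb = Blackboard()
fast = Agent("fast", fast_rule)
slow = Agent("slow", slow_rule)
bb.subscribe(fast, ["Trade"])   # the fast agent sees raw events only
bb.subscribe(slow, ["Alert"])   # the slow agent sees derived facts only

bb.assert_fact("Trade", {"id": 1, "amount": 50})
bb.assert_fact("Trade", {"id": 2, "amount": 5000})
```

Here the slow agent never sees the small trade at all: it is only woken by the one inference the fast agent shares.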
Types of Agents by Memory and Process
There are 4 use cases that BE allows:
A - shared memory, shared rules = replication for throughput
B - different memory, shared rules = data partitioning to handle data volume
C - shared memory, different rules = co-operative agents working against the same information
D - different memory, different rules = independent agent.
Most BE users today are probably doing A (remember most TIBCO customers are large enterprises). The power of this architecture is in the combinations of A-D that you can use in the same distributed application. The design approach is to use C to handle application complexity, and B in a different agent set to provide processing of large volumes of information, and then A for necessary performance / failover support as required for the B and C agents, with D probably providing internal BAM monitoring / checklisting.
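The four cases can be read as a simple two-by-two classification of any pair of agents. A minimal Python sketch, assuming each agent is described only by a memory identifier and a ruleset identifier (the `AgentConfig` type and its field names are invented for illustration, not BE configuration):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentConfig:
    name: str
    memory: str   # identifier of the working memory the agent uses
    rules: str    # identifier of the ruleset the agent runs

def relationship(a: AgentConfig, b: AgentConfig) -> str:
    """Classify a pair of agents into cases A-D from the list above."""
    same_memory = a.memory == b.memory
    same_rules = a.rules == b.rules
    if same_memory and same_rules:
        return "A"   # replication for throughput
    if same_rules:
        return "B"   # data partitioning to handle data volume
    if same_memory:
        return "C"   # co-operative agents on the same information
    return "D"       # independent agents

fraud_1 = AgentConfig("fraud-1", memory="grid", rules="fraud")
fraud_2 = AgentConfig("fraud-2", memory="grid", rules="fraud")
fraud_eu = AgentConfig("fraud-eu", memory="grid-eu", rules="fraud")
pricing = AgentConfig("pricing", memory="grid", rules="pricing")
monitor = AgentConfig("monitor", memory="local", rules="bam")

print(relationship(fraud_1, fraud_2))   # A
print(relationship(fraud_1, fraud_eu))  # B
print(relationship(fraud_1, pricing))   # C
print(relationship(fraud_1, monitor))   # D
```

A single distributed application would typically contain pairs of all four kinds at once.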
Note that the event-driven approach also provides a slight paradigm shift over conventional rule engine usage, which assists with understanding the needs for partitioning...
Comparison of Rule Distribution Approaches
"If a rule engine can use index joins to pattern match, why send all the extra data? "
Remember for event processing, the "extra data" is an event at a time plus any inferences, so this could well be less than re-sending index information around.
"Replicating the full working memory is ideal for fault tolerance, but suboptimal for increasing pattern matching performance and throughput."
Of course, application reliability and uptime are as important to our customers as performance, especially resilience in distributed systems.
"Replicating the full working memory is ideal for fault tolerance, but suboptimal for increasing pattern matching performance and throughput."
Ah - but the point is that the WM is partitioned too! As said earlier, performance will depend on the use case. We need to get PRR or RIF out of the door so we can have some decent benchmarks to prove this stuff in action (across different frameworks).
"replicating just the beta node indexes is much more efficient than replicating all facts"
Which is certainly true if you are replicating all facts on every (rule execution) cycle, as the Rete index represents a "compiled form" of the conditions dependent on those facts. But in BE's rule engine you are:
(a) not replicating the entire WM on every cycle
(b) only replicating certain WM "as needed" (and even then, note that "notification of change" <> "replicate WM fact", and that not all WM will need to be shared anyway)
(c) due to the event-at-a-time nature of continuous, complex event processing, we are likely to be replicating small numbers of facts in any cycle anyway.
So any assumption that
Overhead(replication(index)) << Overhead(replication(WM delta))
is, at best, very use case and implementation dependent - and in the case of CEP, where the WM delta for shared facts may be very small, replicating that delta might not be the overhead some might assume...
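To make the "small delta" argument concrete, here is a hedged Python sketch of computing what would need to be replicated after one event. It is a toy model rather than BE's replication mechanism: working memory is represented as a plain dictionary keyed by (class name, fact id), deletions are ignored, and the function name `shared_delta` is invented:

```python
def shared_delta(before, after, shared_classes):
    """Return only the shared facts that were added or changed between
    two working-memory snapshots (deleted facts are ignored in this sketch)."""
    delta = {}
    for key, value in after.items():
        class_name, _fact_id = key
        if class_name in shared_classes and before.get(key) != value:
            delta[key] = value
    return delta

# A large working memory in which a single new event touches one shared fact.
before = {("Account", i): {"balance": 100} for i in range(10_000)}
after = dict(before)
after[("Account", 42)] = {"balance": 40}         # updated by the new event
after[("Scratch", 0)] = {"note": "local only"}   # a class that is not shared

delta = shared_delta(before, after, shared_classes={"Account"})
print(len(after), len(delta))   # 10001 1
```

With event-at-a-time processing the replicated set tracks the size of the change, not the size of the working memory.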
"the engine first looks in working memory ..."
Since BE2.2, the strategy per class is set by the developer, for in-memory only, cache+memory (default), and cache-only (i.e. explicit retrieval) modes. This is for fine-tuning and performance reasons, obviously.
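The three per-class modes can be pictured with a small Python sketch. It only models where an object ends up being visible; the real BE semantics are richer, and the `Mode` and `ObjectStore` names, as well as the example classes, are assumptions made for illustration:

```python
from enum import Enum

class Mode(Enum):
    MEMORY_ONLY = "in-memory only"
    CACHE_AND_MEMORY = "cache+memory"   # the default
    CACHE_ONLY = "cache-only"           # needs explicit retrieval

class ObjectStore:
    def __init__(self, modes, default=Mode.CACHE_AND_MEMORY):
        self.modes = modes        # class name -> Mode, chosen by the developer
        self.default = default
        self.memory = {}          # stand-in for the agent's working memory
        self.cache = {}           # stand-in for the data grid

    def mode_for(self, class_name):
        return self.modes.get(class_name, self.default)

    def put(self, class_name, key, value):
        mode = self.mode_for(class_name)
        if mode is not Mode.CACHE_ONLY:
            self.memory[(class_name, key)] = value
        if mode is not Mode.MEMORY_ONLY:
            self.cache[(class_name, key)] = value

    def in_working_memory(self, class_name, key):
        return (class_name, key) in self.memory

    def load(self, class_name, key):
        """Explicit retrieval from the cache, as cache-only classes require."""
        return self.cache.get((class_name, key))

store = ObjectStore({"Tick": Mode.MEMORY_ONLY, "Customer": Mode.CACHE_ONLY})
store.put("Tick", 1, {"price": 10.5})
store.put("Customer", "c1", {"tier": "gold"})
store.put("Order", "o1", {"qty": 3})   # no entry, so the default mode applies
```

In this sketch a `Tick` never reaches the cache, a `Customer` is only reachable through `load`, and an `Order` is held in both places.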
"data is divided into concept instances and event data"
Event objects are really just a special type of object, with appropriate event classes. Events are characterised by their Time To Live (how long they live in the Event Processing Agent, whether rule-driven or otherwise), and what "consumes" them. The latter is important because the registration of "consumption" can be used by the middleware layer to determine a successful "delivery" of the event. If the system has a catastrophic failure before event consumption, the event can be consumed by another agent instead.
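The interplay of Time To Live and consumption can be sketched as follows. This is a simplified Python model, not the middleware's actual protocol: agents are plain functions, a failure is simulated with an exception, and the names `Event`, `deliver`, `failing_agent` and `backup_agent` are all hypothetical:

```python
class Event:
    def __init__(self, event_id, payload, arrived_at, ttl):
        self.event_id = event_id
        self.payload = payload
        self.arrived_at = arrived_at   # arrival time in seconds (assumed unit)
        self.ttl = ttl                 # time to live in seconds (assumed unit)
        self.consumed_by = None        # set once an agent registers consumption

    def expired(self, now):
        return now - self.arrived_at > self.ttl

def deliver(event, agents, now):
    """Offer the event to each agent in turn until one consumes it."""
    if event.expired(now):
        return None
    for agent in agents:
        try:
            agent(event)
        except RuntimeError:
            continue   # the agent failed before consuming: try the next one
        event.consumed_by = agent.__name__
        return event.consumed_by
    return None

def failing_agent(event):
    raise RuntimeError("agent crashed before consuming the event")

def backup_agent(event):
    pass   # processing succeeded, so the event counts as consumed

order = Event(1, {"type": "order"}, arrived_at=0.0, ttl=5.0)
print(deliver(order, [failing_agent, backup_agent], now=1.0))   # backup_agent
```

Delivery only counts as successful once consumption is registered, which is what lets a second agent pick up the event after the first one fails.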
"data tends to be relatively static, so it makes sense to put it in an in-memory database"
The in-memory role is primarily for reduced latency. CEP is about comparing new events with past events, so the latter must be easily (and speedily) accessible, whilst being capable of being updated as needed with information from the new events.
"asserting the same facts in multiple rule engines will cause each engine to go through the pattern matching process"
In the general sense, yes, but remember that raw / external events are only ever delivered to a single agent (for processing, and thence assertions / updates to any shared facts).
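One way to picture "delivered to a single agent" is deterministic routing on an event key. The text above does not say how BE selects the agent, so the hashing scheme below is purely an assumption for illustration, as are the function and agent names:

```python
import zlib

def owner_agent(event_key, agents):
    """Pick exactly one agent for a raw event, using a stable hash of its key.
    zlib.crc32 is used because Python's built-in hash() of strings varies per process."""
    index = zlib.crc32(event_key.encode("utf-8")) % len(agents)
    return agents[index]

agents = ["agent-1", "agent-2", "agent-3"]
first = owner_agent("account-42", agents)
second = owner_agent("account-42", agents)
print(first == second)   # True: the same key always lands on the same agent
```

Only the chosen agent pattern-matches the raw event; other agents see just the assertions or updates it makes to shared facts.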
"Does BE integrate coherence Maps for the node memories, or is it simply using coherence as an external datastore?"
The latter, as sharing Rete node information would not be useful for other Event Processing Agent types (i.e. non-Rete) that are contributing to the solution.