Sunday, May 19, 2024

Two Cache Computer

 


Introducing a new computer architecture. 

In current computers the caches must talk to each other: cache coherence. The new design has cache synergy.


There are different types of software data. Computer hardware treats them all as one.

Treating the two types of data with two different algorithms can only be faster.

Introducing a new instruction that enables two types of data processing.

Like wing flaps that optimize airflow, a Two Cache Computer has coherent cache synergy. 

The program selects the cache which determines the processing methodology.

 

The first result is revolutionary. Then Step 2 makes one cache vanish.

Processors can then be connected solely to the memory bus.

The new computer design runs any current multitasking computer software.

The computer is fully described in the Journal of Computer Science and Technology.


Offering a $1 million prize

(Must precede and is contingent upon a licensing agreement.)


Coherent Main Memory


Saturday, May 4, 2024

Software has Three Data Types


Software recognizes three types of data,

but hardware cannot differentiate them.

Idea 1 - Create Data Type Allocation Instruction

 

Idea 2 - Store Data in one place

1970 - Relational Data Bases

 

Idea 3 - Serialize updates through one instruction

1973 - Conditional compare and swap (CS)

 

Combine the three ideas:

Allocation instruction enables hardware to protect update integrity in three ways.

Exclusive data is not shared and can reside in a cache.

Shared data is updated in one place and CS protects using a pointer swap or lock.

Swap data is handled with an atomic CS. (pointer swap, lock, counter)

The hardware no longer needs to ensure update integrity, because the software protects updates with a CS and the data is stored in one place.
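The protection the three ideas rely on is the CS retry pattern. A minimal sketch of the instruction's semantics (a Python model; in hardware the comparison and the swap happen as one atomic, uninterruptible step):

```python
# Model of the 1973 conditional compare-and-swap (CS) instruction.
# The hardware executes compare_and_swap atomically; this Python
# version only illustrates the logic.

def compare_and_swap(mem, addr, expected, new):
    """Replace mem[addr] with new only if it still equals expected."""
    if mem[addr] == expected:
        mem[addr] = new
        return True   # swap succeeded
    return False      # another task updated first; caller must retry

def cs_increment(mem, addr):
    """Lock-free counter update: retry until the CS succeeds."""
    while True:
        old = mem[addr]
        if compare_and_swap(mem, addr, old, old + 1):
            return

mem = {"counter": 0}
for _ in range(3):
    cs_increment(mem, "counter")
# mem["counter"] is now 3
```

Because every update goes through the CS, concurrent tasks serialize on the one copy of the data, which is the software-side integrity the text describes.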

 

However:

CS was implemented in the cache and with coherence (1965 algorithm).


Solution:

Perform the CS in one place, which for multi-core is main memory.


Result:

Coherence vanishes because the software provides update integrity.

Scalable processors that connect only to the bus.

These multi-core processors can reduce the multitasking queue, run a dedicated process, or both.

 

 Different Algorithms for the Same Problem


 

 

 

Thursday, May 2, 2024

History of Coherence

 


1965 - Multiprocessor buffer interrogation*

1970 - Relational DB

1973 - Compare and Swap (CS)

Late 1970s - Locks replaced by CS

1980s - Invalidation replaces Interrogation (Snoopy*)

2022 - Patent for CS in main memory. (CS & main memory)

2023 - Patent for Coherent Memory. (Relational & main memory)

2023 - Patent to identify memory that needs to be coherent. (1970 & 1973 & 2022)

202? - Coherence vanishes.

 

Relational main memory does not require coherence.

 It is Coherent Memory

Coherent Memory = Relational Memory = Shared Data = Relational Data

 

 * Uses an algorithm that scales binomially and limits multi-core to 4.

 


 

 

Saturday, April 27, 2024

Offering a $1 million prize

 

Offering a $1 million prize for a Coherent Memory Model

$1 Million will be awarded for demonstrating the Coherent Main Memory model.

The model requires a hardware and software understanding of atomic instructions. 

 

Store data in one place!


Coherent Memory maintains data integrity without cache coherence. Cache Coherence limits the number of core processors.

Coherent Memory organizes main memory - Database managers store data in one place for performance. Computer memory should too.


Award:

$1 million, contingent on and paid for by a licensing contract for any of the patents.

In addition, the winner will receive all future monetary payments awarded to the inventor based on one or more of the patents.


Terms and Conditions:

The inventor refers to the inventor and/or the companies owning the patents.

The patents refers to all patents, owned by the inventor and/or his companies, with a priority date prior to 4/27/2024.
The winner is the first working model received prior to the signing of the first licensing contract.
If no Step 2 model has been received prior to the signing of the first licensing contract, the first Step 1 model received will be the winner.
If the winner is also the licensee, the payments will be awarded to an unaffiliated university.
Any decision as to who should receive payments is up to the company acquiring the first license and is final.

 

A contingency contract is available.



Frank Yang
FrankYang43338@acm.org
FrankYang43338@gmail.com
 
 

 

Wednesday, April 24, 2024

Software is Coherence-Free

 

Why?

Just like you need to add memory or drives, you need to add speed.

But you cannot. Why not? It is called coherence. Computer caches must talk.

However, software tasks do not talk. Infinite tasks run concurrently on a uniprocessor.

The answer to why solves a hardware issue that predates 1965. 

Since 1965 the question has been, "How do you make caches talk efficiently?"

Answer - Infinite computers do not have to talk at all. 

Coherence goes poof.


 Because...

In 1965, IBM announced a multiprocessor. It had two processors and two caches, and the caches talked to synchronize. IBM announced it as a quad, but a quad was never built. The algorithm has been optimized, but not redesigned.

In 1970 the creation of relational databases resulted in the recognition that data should be stored in one location. Store data in one place!

In 1973 the creation of the conditional compare and swap instruction enabled most software locks to be eliminated.

However, work on cache synchronization, now termed cache coherence, continued with an algorithm conceived before 1965.

This new computer design incorporates those two software changes into the hardware, resulting in processors that do not communicate; true parallel processing. The design requires a change to the hardware implementation of the conditional compare and swap instruction.

 

 

Coherent Memory maintains data integrity without cache coherence


How

Software recognizes that some data is shared, meaning that it is R/W for other processes, and it handles this data differently. The hardware would also perform better if it handled data in two ways. Step 1 is a new allocation instruction that allocates data for either shared or non-shared processing. Coherence is immediately reduced, because non-shared data does not require coherence. (Load balancing can be handled by cast-out, but is not needed given sufficient processors.) Having sufficient processors changes everything, because multitasking exists to address insufficient processors. Multitasking has a queue, which impacts elapsed time.
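One way to picture the Step 1 allocation instruction is as a tag applied at allocation time. The names below (MEM_EXCLUSIVE, MEM_SHARED, alloc) are illustrative only, not part of the design; the sketch shows how tagging lets the hardware skip coherence for non-shared data:

```python
# Hypothetical model of the Step 1 data type allocation instruction.
# All identifiers here are illustrative; the real mechanism would be
# a hardware instruction, not a Python class.

MEM_EXCLUSIVE = "exclusive"  # not shared; may live in a cache, no coherence
MEM_SHARED = "shared"        # R/W by other processes; updated in one place

class Memory:
    def __init__(self):
        self.regions = {}

    def alloc(self, name, kind):
        """Tag a region as exclusive or shared at allocation time."""
        self.regions[name] = kind

    def needs_coherence(self, name):
        # Only shared data could ever need coherence; exclusive data never does.
        return self.regions[name] == MEM_SHARED

m = Memory()
m.alloc("task_stack", MEM_EXCLUSIVE)  # private to one task
m.alloc("run_queue", MEM_SHARED)      # updated by many tasks
```

Under this model, every region tagged exclusive drops out of the coherence protocol immediately, which is the Step 1 reduction the text describes.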


 Impact

Multi-core processors can finally exceed the 1965 design limit of four. The new limit is infinite.

Step 1 - Reduces coherence.

Step 2 - Eliminates all remaining coherence.

 

 

History

Cache coherence and an interlock prevent core processors from being connected solely to the main memory bus.

The IBM manual linked below explains the entire issue on page 104. The 2nd and 3rd paragraphs explain buffer (cache) invalidation. The next-to-last paragraph explains the processor interlock. The interlock is for CS, CDS, and TS instructions. These are HSP instructions. Implementing an HSP that does not contain an interlock creates a coherence-free swap (CFS). However, removing this interlock has been of no benefit, because shared data in cache memory requires cache coherence.

However if shared data is not stored in cache memory, it does not require cache coherence. Then no interlock enables processors to connect directly to the main memory bus.

 

 

IBM 3033 Processor Complex April 1979

 

Chronology

Chronology of disappearance of Coherence

 

Design Notes

  • Step 1 permits an exclusive cache, which requires neither coherence nor write-through.   
  • Step 1, in conjunction with replacing the HSP with a CFS, allows the hardware to handle shared data without a cache and therefore without coherence.
  • Coherence is solely for hardware update integrity. Because of multitasking, software already protects from changes made by other tasks. Software update integrity only requires the interlock. Eliminating the interlock can be done either by serialization or with an uninterruptible swap in memory, but it cannot be implemented in the cache because that causes coherence.
  • Eliminating both the interlock and cache coherence enables core processors to be connected solely to the main memory bus.
  • The swap could be performed by a memory processor. One would be required for each memory bank. 
  • More cost-efficient is to have an instruction that allocates swap memory, so only the swap memory bank would require a memory processor.
  • If swap memory is only altered with a CFS, then a processor could be dedicated to handling swap instructions. This processor would be able to keep all the swap areas in its cache.
  • Additional latency is restricted to the CFS instructions.
  • Implementing Step 1 alone will reduce both parts of coherence, which consist of a write-through and an invalidation. The benefit is expected to be great enough that Step 2 will not need to be modeled for licensing.
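The memory-processor design note above can be modeled roughly as a single agent that is the only writer of swap memory, so requests serialize without any per-core interlock. A thread and a queue stand in for the memory-side processor and the bus; every name here is illustrative:

```python
# Rough model of a coherence-free swap (CFS) performed at main memory.
# A single memory-side agent serializes all swap requests, so no core
# interlock and no cache invalidation are needed. In hardware this agent
# would be a memory processor, not a thread.

import queue
import threading

swap_mem = {"head": None}          # the protected swap area
requests = queue.Queue()           # stands in for the memory bus

def memory_processor():
    """The one place where swap memory is altered."""
    while True:
        addr, expected, new, reply = requests.get()
        if addr is None:           # shutdown signal for this model
            break
        ok = swap_mem[addr] == expected
        if ok:
            swap_mem[addr] = new
        reply.put(ok)

def cfs(addr, expected, new):
    """Issue a conditional swap to the memory processor and await the result."""
    reply = queue.Queue()
    requests.put((addr, expected, new, reply))
    return reply.get()

t = threading.Thread(target=memory_processor)
t.start()
ok = cfs("head", None, "node0")    # publish a pointer with one swap
requests.put((None, None, None, None))
t.join()
```

Because only the memory-side agent touches swap memory, the cores never hold that data in their caches, which is why the coherence traffic for it disappears.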
 

Coherent Memory Management

 

There are three types of main memory. 

1 - Static Common Memory

2 - Protected Shared Memory

3 - Protected Swap Area

These are described in detail in the JCST article on CFPs. (Fig. 2)

Static Common Memory would consist of data that does not change dynamically, such as programs.

Protected Shared Memory is protected by software logic because it is updated by multiple processes. The software uses the CFS either as a lock/unlock or as a pointer swap. The software currently protects because of multitasking.

Protected Swap Areas are data areas used by a CFS in a conditional swap. An example would be a conditional swap to increment a counter.

All the above areas are protected by the software. The issue is that the current HSP does the swap in the cache and therefore requires coherence. All instructions are currently implemented in the cache. Doing the swap in main memory bypasses the cache and eliminates coherence, provided all other memory is protected by software logic.
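The two software protections described for Protected Shared Memory can be sketched side by side: a conditional swap used as a lock/unlock, and the same swap used to publish a pointer. This is a sequential Python model of the instruction's semantics, not the hardware:

```python
# Model of the two ways software protects Protected Shared Memory with a
# conditional swap (CFS): as a lock/unlock, and as a pointer swap.
# The hardware performs the swap atomically; this model runs sequentially.

def cfs(mem, addr, expected, new):
    """Conditional swap: store new only if mem[addr] still equals expected."""
    if mem[addr] == expected:
        mem[addr] = new
        return True
    return False

mem = {"lock": 0, "list_head": "old_version"}

# 1. Lock/unlock: acquire by swapping 0 -> 1; a second acquirer would fail
#    and retry. Release with a plain store of 0.
acquired = cfs(mem, "lock", 0, 1)
# ... update the shared structure while holding the lock ...
mem["lock"] = 0

# 2. Pointer swap: build a new copy privately, then publish it in one
#    atomic step. Readers see either the old version or the new, never a mix.
published = cfs(mem, "list_head", "old_version", "new_version")
```

Either way, the single conditional swap is the only serialization point, which is what lets the rest of the update run without hardware coherence.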

Summary:
The software currently uses the HSP to eliminate coherence from the software. However, the hardware instruction was implemented in the cache; therefore the implementation requires coherence. If the instruction bypassed the cache, there would be no cache coherence.

Protected Shared Memory is also updated in main memory. Therefore it has no coherence.





 

Thread Safe Computers

  The invention redefines thread safe.  For comparison, two current definitions are shown below. Due to blog incompatibility current version...