
Basic knowledge that the most hardcore Java programmers must have (I)

2022-02-04 16:41:52 Xiao Huang who likes to knock code

Hello, I'm Xiao Huang, a Java development engineer at a unicorn company. Thank you for meeting me in this vast sea of people. As the saying goes: when your talent and ability are not yet enough to support your dreams, calm down and study. I hope you will study with me, work hard with me, and realize your dreams.


One 、 Introduction

Java developers usually treat low-level knowledge as a black box that never needs to be opened.

But as the industry develops, we do need to open this black box and explore what is inside.

This series of articles will take you through the mysteries of that low-level black box.

Two 、 Recommended books

The principle of reading: do not insist on understanding every detail, grasp the overall picture first.

If you walk into Lushan Mountain and immediately put your head down and bend over to dig into details, instead of first studying the overall layout of the mountain, your way of learning is bound to be inefficient and painful.

Worst of all, it wears down your enthusiasm: "Why is studying so joyless? Why is it so boring?" Because your learning method is wrong. Read for the general idea, start using what you read, and as you use it many things will become clear.

  • 《Code: The Hidden Language of Computer Hardware and Software》
  • 《Computer Systems: A Programmer's Perspective》 (not recommended)
  • 《Introduction to Algorithms》, 《Data Structures and Algorithms in Java》, 《Coding Interviews》 (剑指 Offer)
  • 《30 Days to Build Your Own Operating System》
  • 《TCP/IP Illustrated》, Volume 1
  • The Dragon Book: 《Compilers: Principles, Techniques, and Tools》

Three 、 Hardware Basics

1、How a CPU is made

How is a CPU made?

I believe everyone has wondered about this. Let me tell you today: every CPU starts out as sand.

For the full production process, here is an article you can read if you are interested: How a CPU is made

If you are not interested, go straight to the summary:

  • Step one: extract monocrystalline silicon ingots from sand
  • Step two: slice the ingot into thin discs to obtain wafers
  • Step three: bombard the wafers with metal particles, then electroplate them
  • Step four: apply photolithography to the wafer and complete the wiring between the transistors
  • Step five: inspect the quality and discard defective CPUs

2、How a CPU works

The first problem a computer has to solve: how to represent numbers?

The earliest computers used light bulbs (strictly speaking, vacuum tubes). To calculate 0100 + 1010, we use 8 bulbs and let the on or off state of each bulb represent 1 or 0. With that, the first version of our computer works.

  • ENIAC weighed 27 tons and covered 1,800 square feet (about 167.2 square meters). Built during World War II, it was originally used to help the artillery calculate shell trajectories.

This first version of the computer had a flaw that is both funny and painful.

When the computer calculates at high speed, the bulbs flash more frequently and may burn out. Staff then have to replace them in time, which hurts efficiency.

Still, as the first computer, its appearance was enough to astonish the world.

Today most computers calculate with transistors, which are combined into AND, OR, NOT, and NOR gates to implement different operations.

We often hear about 32-bit and 64-bit CPUs. Simply put, the difference between them is how many bits the CPU handles at one time.

Any calculation can be built from logical operations. Take the logical operation 0 && 1 = 0: how does it happen?

First, consider the circuit of an AND gate. In this circuit, A and B are the inputs and Q is the output.

For example, if A is driven to a low level and B is driven to a high level, Q outputs a low level. In binary terms, A is 0 and B is 1, so Q outputs 0. The corresponding logical expression is 0 && 1 = 0.
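To make the claim that calculations reduce to logic gates more concrete, here is a minimal sketch that models the gates on single bits and chains them into a 4-bit ripple-carry adder, then computes the 0100 + 1010 example from earlier. The class and method names (`GateAdder`, `add4`, and so on) are hypothetical and chosen for illustration only; real hardware does this with transistors, not Java methods.

```java
// Illustrative sketch only: GateAdder and its method names are assumptions, not from the article.
final class GateAdder {
    // The basic gates, modeled on single bits (each argument is 0 or 1).
    static int and(int a, int b) { return a & b; }
    static int or(int a, int b)  { return a | b; }
    static int not(int a)        { return a ^ 1; }

    // XOR built only from AND, OR and NOT: (a OR b) AND NOT (a AND b).
    static int xor(int a, int b) { return and(or(a, b), not(and(a, b))); }

    // Adds two 4-bit numbers one bit at a time (ripple-carry adder).
    // The carry out of the highest bit is dropped, as in a fixed-width register.
    static int add4(int x, int y) {
        int carry = 0;
        int result = 0;
        for (int i = 0; i < 4; i++) {
            int a = (x >> i) & 1;
            int b = (y >> i) & 1;
            int sum = xor(xor(a, b), carry);
            carry = or(and(a, b), and(carry, xor(a, b)));
            result |= sum << i;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(and(0, 1));                                    // 0
        System.out.println(Integer.toBinaryString(add4(0b0100, 0b1010))); // 1110
    }
}
```

Each loop iteration plays the role of one full adder: two XOR gates produce the sum bit, and the AND and OR gates produce the carry passed to the next bit.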

Here is the popular story about the origin of the word BUG: once upon a time, someone running calculations on a computer found that the results were always wrong and could not find the cause for a long time. Later it turned out that an insect (a bug) had got into the machine and damaged the circuit, so it could no longer switch between low and high levels. Since then, programming mistakes have been called bugs.

3、How assembly language is executed

Think about it: when an event such as 0 && 1 = 0 happens in the circuit above, how does the user know that it happened inside the machine?

We cannot simply take the machine apart and watch the levels change inside.

This is where assembly language comes in. In essence, assembly language is a set of mnemonics for machine language.

For example, when we tell the computer to calculate 1 + 2, it has to switch between high and low levels and output the result.

In assembly language this is expressed with instructions such as mov, add, sub, ...

With these, we can see at a glance what the computer is currently doing.

[Figure: composition diagram of a computer]

[Figure: the whole process of a computer calculation]

Here is the difference between Java and C:

  • C: the source code is compiled directly into machine code that the CPU executes
  • Java: the source code is compiled into bytecode, which the JVM translates into machine code for the CPU. This is the key to Java being cross-platform
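As a small illustration of the Java side of this comparison, the sketch below shows a method together with the bytecode that `javac` produces for it. The class name `AddDemo` is hypothetical. The instruction listing in the comment is what `javap -c` typically prints for a static method like this; the JVM then interprets or JIT-compiles these instructions into machine code for whatever CPU it is running on.

```java
// Illustrative sketch only: AddDemo is a hypothetical class name.
final class AddDemo {
    // javac compiles this method to JVM bytecode, not to machine code.
    // `javap -c AddDemo` typically shows its body as:
    //   iload_0   // push the first int argument onto the operand stack
    //   iload_1   // push the second int argument
    //   iadd      // pop both values and push their sum
    //   ireturn   // return the int on top of the stack
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(1, 2)); // 3
    }
}
```

Like assembly mnemonics such as mov, add and sub, the bytecode names are readable labels for numeric opcodes, only here the target is the JVM rather than a physical CPU.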

4、Quantum computers

Quantum computers are still being explored around the world, and there is no practical general-purpose machine yet.

In an ordinary computer, one bit represents either 1 or 0, and 32 bits hold exactly one of 2^32 possible values at a time.

The highlight of a qubit is that it can represent 1 and 0 at the same time.

  • One qubit: 1, 0
  • Two qubits: 00, 01, 10, 11
  • Three qubits: 000, 001, 011, 111, ...
  • Thirty-two qubits: all 2^32 values at once

This description may not be intuitive, so let's look at an example:

Suppose there is a number that we know lies in the range 1 to 2^32. How can we find it quickly?

Ordinary bits can represent only one value at a time, so in the worst case we have to loop 2^32 times to find the number.

With qubits, 32 qubits can represent all 2^32 candidates at once, so in this simplified picture the search can be completed in one go. (Real quantum search algorithms still need repeated steps, but far fewer than 2^32.)

5、Basic components of a CPU

  • PC (Program Counter): holds the address of the current instruction
  • Registers: temporarily store the data the CPU needs for its calculations
  • ALU (Arithmetic & Logic Unit): the arithmetic unit, which performs the calculations
  • CU (Control Unit): the control unit
  • MMU (Memory Management Unit): the memory management unit
  • Cache: the cache

5.1 ALU

Earlier CPUs were single-core. In that case there is only one set of registers: the PC keeps switching to point at a new thread, and that thread's data is loaded into the registers. This switching (a context switch) seriously hurts efficiency.

Current CPUs are usually multi-core, and a core may hold two or more sets of registers and PCs (as with hyper-threading). In that case the registers and PC no longer need to be swapped as often: the ALU simply alternates between the register sets while calculating.

5.2 register


5.3 Cache

To make data faster to reach, the computer adds caches, up to L3, between the registers and main memory. Each cache level has a different access time.

For a multi-core CPU, the caches are laid out as follows:

  • L1 and L2 are private to each core
  • L3 is shared by all cores in the same CPU

5.3.1 Locality principle

Simply put, when the CPU reads data it reads a whole block at a time instead of a single byte. Suppose the CPU currently needs the value X. The steps are as follows:

  • Step one: look for X in the registers
  • Step two: look for X in the L1, L2, and L3 caches
  • Step three: look for X in memory, then on disk and so on
  • Step four: once X is found, take the 64-byte block that contains X
  • Step five: store this block in L3, L2, and L1 in turn, so that the next access can be served from the cache
  • Step six: write X to a register and process the data
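The effect of reading whole blocks can be observed from Java with a minimal sketch like the one below, which sums the same matrix in two traversal orders. The class and method names are hypothetical, and the timings depend on the machine and the JIT, so no particular numbers are claimed; on typical hardware the row-by-row walk tends to be faster because consecutive elements of a row sit in the same 64-byte block.

```java
import java.util.Arrays;

// Illustrative sketch only: LocalityDemo and its method names are assumptions.
final class LocalityDemo {
    // Walks the matrix row by row: consecutive reads hit neighboring
    // elements of the same int[] and therefore the same cache line.
    static long sumRowMajor(int[][] m) {
        long sum = 0;
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                sum += m[i][j];
            }
        }
        return sum;
    }

    // Walks the matrix column by column: every read jumps to a different
    // row array, so the block loaded for the previous read is rarely reused.
    // Assumes a rectangular matrix (all rows as long as the first one).
    static long sumColumnMajor(int[][] m) {
        long sum = 0;
        for (int j = 0; j < m[0].length; j++) {
            for (int i = 0; i < m.length; i++) {
                sum += m[i][j];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 2048;
        int[][] m = new int[n][n];
        for (int[] row : m) {
            Arrays.fill(row, 1);
        }

        long t0 = System.nanoTime();
        long a = sumRowMajor(m);
        long t1 = System.nanoTime();
        long b = sumColumnMajor(m);
        long t2 = System.nanoTime();

        System.out.println("row-major:    " + (t1 - t0) / 1_000_000 + " ms, sum=" + a);
        System.out.println("column-major: " + (t2 - t1) / 1_000_000 + " ms, sum=" + b);
    }
}
```

Both methods return the same sum; only the order in which memory is touched differs.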

5.3.2 MESI cache coherence protocol

As we can see, the L1 and L2 caches of the two cores above must keep their cache lines consistent. The protocol that keeps them consistent is called the MESI cache coherence protocol.

The CPU marks every cache line with one of four states:

  • M (Modified): the cache line is valid, the data has been modified and differs from the data in memory, and the data exists only in this cache.
  • E (Exclusive): the cache line is valid, the data is consistent with the data in memory, and the data exists only in this cache.
  • S (Shared): the cache line is valid, the data is consistent with the data in memory, and the data exists in several caches.
  • I (Invalid): the cache line is invalid.
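The four states above can be read as a small state machine for one cache line in one core's cache. The sketch below is a deliberately simplified model, assuming only four events (this core reads, this core writes, another core reads, another core writes). The names `MesiLine`, `localRead` and so on are hypothetical, and the bus messages, write-backs and timing of a real CPU are left out.

```java
// Illustrative sketch only: a simplified model, not how any specific CPU implements MESI.
final class MesiLine {
    enum State { MODIFIED, EXCLUSIVE, SHARED, INVALID }

    State state = State.INVALID;

    // This core reads the line. On a miss the line is loaded as SHARED if
    // another cache already holds it, otherwise as EXCLUSIVE.
    void localRead(boolean othersHoldCopy) {
        if (state == State.INVALID) {
            state = othersHoldCopy ? State.SHARED : State.EXCLUSIVE;
        }
    }

    // This core writes the line: it now holds the only valid copy,
    // and that copy differs from memory.
    void localWrite() {
        state = State.MODIFIED;
    }

    // Another core reads the line: a valid copy here becomes SHARED
    // (a MODIFIED line would be written back to memory first).
    void remoteRead() {
        if (state != State.INVALID) {
            state = State.SHARED;
        }
    }

    // Another core writes the line: the copy here is no longer valid.
    void remoteWrite() {
        state = State.INVALID;
    }

    public static void main(String[] args) {
        MesiLine line = new MesiLine();
        line.localRead(false);
        System.out.println(line.state); // EXCLUSIVE
        line.localWrite();
        System.out.println(line.state); // MODIFIED
        line.remoteRead();
        System.out.println(line.state); // SHARED
        line.remoteWrite();
        System.out.println(line.state); // INVALID
    }
}
```

The last transition is the one that matters for performance: every write by another core invalidates this core's copy, which then has to be fetched again.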


Intel: cache line

  • The larger the cache line, the better the spatial locality, but the slower the read
  • The smaller the cache line, the worse the spatial locality, but the faster the read
  • Based on experiments, Intel settled on a cache line size of 64 bytes

Bus lock (when the data cannot be held in a cache line, the bus has to be locked)

MESI is one way of implementing a cache lock. For data that cannot be cached, or data that spans multiple cache lines, a bus lock must still be used.

How can we test this conjecture?

We tested two programs.

Space is limited, so the source code is not shown here for now. If you are interested, follow the official account and reply: Algorithm source code

  • The first program: the values being changed are in the same cache line

  • The second program: the values being changed are not in the same cache line

The programs confirm our conjecture: when two threads modify values in the same cache line, each write keeps invalidating the copy of that line in the other core's cache, which results in a longer running time.
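Since the author's source code is not shown, the following is a minimal sketch of how such a comparison could be written. It is not the author's program: the class names, the field layout and the iteration count are all assumptions, and the JVM is free to lay out fields differently, so the padding is a best effort rather than a guarantee. No timings are claimed here; on typical hardware the padded version tends to finish noticeably faster.

```java
// Illustrative sketch only: not the author's test program; all names are assumptions.
final class FalseSharingDemo {
    // Two counters declared next to each other, so they most likely
    // end up in the same 64-byte cache line.
    static class Shared {
        volatile long a;
        volatile long b;
    }

    // The same counters separated by 7 longs (56 bytes) of padding,
    // so they most likely end up in different cache lines.
    static class Padded {
        volatile long a;
        long p1, p2, p3, p4, p5, p6, p7;
        volatile long b;
    }

    // Runs both tasks on their own threads and returns the elapsed milliseconds.
    static long time(Runnable r1, Runnable r2) throws InterruptedException {
        Thread t1 = new Thread(r1);
        Thread t2 = new Thread(r2);
        long start = System.nanoTime();
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Each thread increments its own counter n times.
    // Returns { elapsed ms, final a, final b }.
    static long[] sharedRun(long n) throws InterruptedException {
        Shared s = new Shared();
        long ms = time(() -> { for (long i = 0; i < n; i++) s.a++; },
                       () -> { for (long i = 0; i < n; i++) s.b++; });
        return new long[] { ms, s.a, s.b };
    }

    static long[] paddedRun(long n) throws InterruptedException {
        Padded p = new Padded();
        long ms = time(() -> { for (long i = 0; i < n; i++) p.a++; },
                       () -> { for (long i = 0; i < n; i++) p.b++; });
        return new long[] { ms, p.a, p.b };
    }

    public static void main(String[] args) throws InterruptedException {
        long n = 100_000_000L;
        System.out.println("same cache line:      " + sharedRun(n)[0] + " ms");
        System.out.println("different cache line: " + paddedRun(n)[0] + " ms");
    }
}
```

Each counter is written by only one thread, so the final values are correct in both runs; the only difference is how often the two cores invalidate each other's cache line.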

5.3.3 Cache line alignment

Some especially sensitive values are accessed by many competing threads. To make sure no false sharing occurs, we generally use cache line alignment.

Simply put, when we fetch X we do not want Y to be loaded into the same cache line along with it.

Both JDK 7 and Disruptor use long fields as cache line padding:

public long p1, p2, p3, p4, p5, p6, p7; // cache line padding
private volatile long cursor = INITIAL_CURSOR_VALUE;
public long p8, p9, p10, p11, p12, p13, p14; //cache line padding

In this case, when a cache line is fetched, cursor is loaded together with the long fields in front of it or behind it, so cursor never shares a cache line with unrelated data.

In JDK 8 we can annotate the field with @Contended (the padding is chosen according to the underlying CPU, ensuring that two fields do not share a cache line). The JVM option -XX:-RestrictContended must be added for it to take effect.

copyright notice
author[Xiao Huang who likes to knock code],Please bring the original link to reprint, thank you.
