Wednesday, 1 October 2014

Message from SOCD Team

Hi All,

After an overwhelming response from people who want to learn the basics of electronics and are looking for resources on SoC design, we thought it would be a good idea to put together a basic learning path for all the "Electronics Enthusiasts". Motivated by the appreciative response, we have added another menu button containing links to important lectures, which can be used as a road-map for learning the basics of SoC design and related fields.


Under the Video Tutorials tab above you will find posts containing links to various courses that can be taken to learn SoC design, starting from scratch. This section will also contain links to videos in various fields of SoC design and application, including VLSI digital design, computer architecture, analog design, embedded systems, and languages like Verilog/SystemVerilog.

We do not guarantee that you will become an instant success after going through the videos, but we do guarantee that, given a proper amount of time, one can become a virtuoso in the field one chooses to work in. It will take time, but the time spent will be highly rewarding and will start a new learning curve on which you can build.

Also, these lectures (if the mentioned pattern is followed) will be helpful when you join the industry of your preference (and in your interviews as well). We found these lectures awesome and highly satisfying, and hope they do the same for you. With the semiconductor market being highly competitive, it is really difficult for a fresher to land a job in the industry, but with proper planning and the correct resources there is a vast number of opportunities in the field. In fact, many companies have plenty of vacancies but are unable to find the appropriate talent among a large pool of electronics engineers, because candidates lack the appropriate knowledge and basic skill set.

Putting these tutorials (and the blog posts) together is our means of providing the knowledge relevant to any candidate aspiring to land a job in the semiconductor industry.

We hope all this effort will be really helpful for learners!!!

Kindly like the Facebook page of the blog to get updates. The link is provided below:

SoC-ASIC-Design Facebook Page

or you can subscribe to the blog by clicking on the subscribe option on the right side of the window.

Till the next post.

Cheers !!!!

SOCD Team

Saturday, 13 September 2014

Standard Clock Gate Cell

Moving ahead from my last article on clock gating (you can read the previous article here: Low Power Techniques - Clock Gating), this small article introduces the standard “Clock Gate Cell”. This cell is available to all designers in the semiconductor industry, along with the other standard cells in their libraries. It is good to recognize early that clock and power are closely intertwined, and clocks provide a good many avenues for saving power. So knowing more about clocks is always a plus for a person interested in designing "Power Aware Circuits".

In the last article, we saw that there was a glitch in “certain cases” when we gated the clock. These “certain cases” led to the development of this cell. If you look more closely at the waveform given in the last article (reproduced below for convenience), the glitch appears whenever the enable signal goes low while the clock is high (high as in logic 1).
Fig 1. AND Gate as a Clock Gate and Problem of Glitch (Reproduced from Low Power Techniques- Clock Gating)

The above observation is simple enough to propose a solution for the glitch problem in AND gate. Fig 2. shows the clock gate cell. 
Fig 2. Clock Gate Cell


This cell stops the clock without any glitch in the output.  And how does it do that? It utilizes a LATCH for this purpose. 

-----------------------------
Latching Mechanism
-----------------------------
A latch is a level-sensitive device whose output follows its input as long as its Enable signal is at a particular logic level, generally logic 1. The output tracks the input while Enable is logic 1. When Enable switches to logic 0, the output retains the last value it captured from the input until Enable returns to logic 1, at which point it starts tracking the input again.

------------------------------------------------------------------
Latching Mechanism as used in Clock Gate Cell
------------------------------------------------------------------

For clock gating, the enable pin of the latch is tied to the inverted clock signal (Clk_in), so that the latch output changes only when the clock is at a particular logic level (logic 0 in our case). We pass the Enable signal through this latch, so changes on Enable are recorded only while Clk_in is low (logic 0), and then AND the latch output with the clock signal. This ensures that Enable going low is registered only when Clk_in is low, and hence there is no glitch in the output.

Note that we have inverted the clock before sending it to the latch to ensure that the latch is sensitive to logic 0 of Clk_in (since the latch itself is sensitive to logic 1). Before I proceed further, I would like to point out that a latch is a very powerful tool available to designers and is used in many circuits (it will be a hot topic in some of the coming articles). With the above cell, Fig 1. changes to Fig 3. as below:

Fig 3. Timing Waveform for Clock Gate

In the above figure, notice how the enable edges that occurred while the clock was high have been translated, or rather delayed, into the clock-low phase.
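The behaviour described above can be checked with a small sample-by-sample simulation. This is only a sketch under assumptions: the sample resolution, the function names and the enable timing are made up for illustration, and all delays are ignored.

```python
HALF_PERIOD = 4  # samples per clock phase (assumed for illustration)


def make_clock(n_samples):
    """Clock that starts low: low for HALF_PERIOD samples, then high, and so on."""
    return [(t // HALF_PERIOD) % 2 for t in range(n_samples)]


def and_gate(clk, enable):
    """Plain AND gating: the output follows Enable immediately."""
    return [c & e for c, e in zip(clk, enable)]


def clock_gate_cell(clk, enable):
    """Latch-based gating: the latch is transparent only while clk is low."""
    latched = 0
    out = []
    for c, e in zip(clk, enable):
        if c == 0:          # inverted clock enables the latch
            latched = e     # latch tracks Enable
        # while clk is high the latch holds its last value
        out.append(c & latched)
    return out


def pulse_widths(signal):
    """Lengths (in samples) of every run of consecutive 1s."""
    widths, run = [], 0
    for bit in signal:
        if bit:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


clk = make_clock(32)
# Enable falls at t=14 and rises at t=22, both while the clock is high.
enable = [0 if 14 <= t < 22 else 1 for t in range(32)]

print("AND gate pulse widths :", pulse_widths(and_gate(clk, enable)))
print("Gate cell pulse widths:", pulse_widths(clock_gate_cell(clk, enable)))
```

With these inputs the plain AND gate emits two runt pulses that are only 2 samples wide, while every pulse out of the latch-based cell is a full 4-sample clock phase.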

We can use this cell to TURN OFF an entire clock domain or TURN OFF only a few flops. It is entirely up to the designer how much power to save, and accordingly we have several levels of clock gating in any SoC. Also note that these Enable signals are controlled by Finite State Machines (FSMs), which trigger based on inputs from other FSMs in the SoC.


One small addition to all the above: a clock gate is a hardware means of controlling the clock signal. There are software means as well, in which we program the PLL itself to control the entire clock tree.

I conclude this article here with these lines of Dr. Thomas Fuller :
"Let not thy will roar, when thy power can but whisper"

Tuesday, 2 September 2014

Low Power Techniques : Contemporary Strategies - I

Most SoC designs for the mobile domain (by mobile domain, I mean the devices used in smartphones, tablets, laptops, or any gadget which can be easily carried from one place to another) are required to consume little power while still delivering performance.


This article briefly discusses the various contemporary techniques employed by design companies in coming up with such complex SoCs. Before we proceed any further, I would like to differentiate between two terms which are often used interchangeably in the industry these days: Low Power and Power Aware.

There is a difference between the two terms. Note that “Low Power is not the same as Power Aware.”

Difference between Low Power and Power Aware Techniques

Low Power (in the context of these devices) means that the device consumes little power owing to the way it is implemented. One example would be the use of high-threshold (High Vth) MOS transistors in the design, which leak less but switch more slowly than low-Vth transistors, which are fast but leaky.

Power Aware, on the other hand, implies that the SoC or device itself is intelligent enough to shut down or turn-off the units which are not required while performing a particular operation or the device knows when to operate at a lower frequency or in a low power consuming mode/state.

Contemporary SoCs mix Power Aware and Low Power techniques to achieve good performance and low power consumption at the same time. Because power consumption has become such a concern for these small devices, a new parameter has emerged as a metric for SoCs: Performance per Watt, or in industry lingo, Perf/Watt. Note that this number will be different for the different tests/benchmarks we run on the design. Generally the worst-case Perf/Watt should be reported.
So here I go, listing and summarizing the techniques that are being used.

I leave it to you to categorize each technique as low power or power aware. :)

The first and simplest to understand is MTCMOS

1. MTCMOS (Multi-Threshold CMOS)
Fig 1.  Types of MOS Transistors with different Vt

As mentioned above, this technique is highly useful in reducing leakage power. Every company has its own standard cell library, and a standard library is provided by “The Foundry” as well. This library (the one provided by the foundry) contains cells built from transistors with different threshold voltages (Vth). Low-Vth cells (termed LVT cells) have higher leakage but higher speed, so we tend to use them in timing-critical paths (What is a timing critical path??), in other words the paths with the largest delay (arising from flops and combinational logic). For all other paths, or for the logic in between, we use cells with medium Vth (SVT, for Standard Vth) or high Vth (HVT), which are slower but leak far less. In fig 1b., the solid line drawn in the gate depicts a high-threshold device. The Vth of the transistor is changed by varying the oxide (SiO2) thickness in the MOS transistor, hence the solid line representing a thick oxide layer in fig 1b.
For details you can visit this article again: Basics of MOS Devices
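The selection rule described above (LVT only where timing is tight, SVT or HVT elsewhere) can be sketched as a simple function of timing slack. Everything specific here is a hypothetical illustration: the slack thresholds, the path names and the function name are assumptions, and real flows make this choice inside the synthesis and place-and-route tools.

```python
def choose_vt_flavor(slack_ns, critical_slack_ns=0.05, relaxed_slack_ns=0.50):
    """Pick a cell flavor from the timing slack of the path it sits on.

    The two slack thresholds are hypothetical numbers chosen for illustration.
    """
    if slack_ns <= critical_slack_ns:
        return "LVT"   # fast but leaky: reserved for timing-critical paths
    if slack_ns <= relaxed_slack_ns:
        return "SVT"   # middle ground between speed and leakage
    return "HVT"       # slow but low leakage: used where timing is relaxed


# Hypothetical paths and slacks (in ns), not taken from any real design.
paths = {"cpu_alu": 0.02, "display_fifo": 0.30, "audio_ctrl": 1.20}
for name, slack in paths.items():
    print(f"{name}: slack={slack} ns -> {choose_vt_flavor(slack)}")
```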

2. Multiple Supply Voltages

Fig 2. A SoC with Multiple Voltage Islands

Kindly note that the above figure is arbitrarily drawn and bears no resemblance to any existing SoC.
For a long time the industry has been troubled by dynamic power consumption, which is a strong function of the operating voltage, frequency and load capacitance, given by the relation:

$P_{dyn} = C_L V_{DD}^2 f$
Note that in above equation I have not considered the switching factor.
As per the above equation, if we halve the operating voltage while keeping everything else constant, the dynamic power is reduced by a factor of 4. This simple approach is used in SoCs which have different voltage islands. Different parts of the SoC operate at different voltages and, of course, at different frequencies. As an example, the audio unit in an SoC can operate at a lower frequency than the display unit, and hence we can use different voltages for these two types of IP. Such a scenario is depicted in the figure above.
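The quadratic dependence on supply voltage is easy to check numerically. The sketch below evaluates the dynamic power relation for two supply voltages. The capacitance and frequency values are assumptions picked only to give round numbers, and the optional alpha parameter stands in for the switching factor that the equation above leaves out.

```python
def dynamic_power(c_load, vdd, freq, alpha=1.0):
    """P_dyn = alpha * C_L * VDD^2 * f (alpha = 1 ignores the switching factor)."""
    return alpha * c_load * vdd ** 2 * freq


# Assumed values for illustration only.
C_LOAD = 1e-9   # 1 nF of total switched capacitance
FREQ = 1e9      # 1 GHz clock

p_full = dynamic_power(C_LOAD, vdd=1.0, freq=FREQ)
p_half = dynamic_power(C_LOAD, vdd=0.5, freq=FREQ)

print(f"P at 1.0 V: {p_full:.3f} W")
print(f"P at 0.5 V: {p_half:.3f} W")
print(f"Reduction factor: {p_full / p_half:.1f}")
```

Halving the voltage alone gives the factor of 4. If the lower voltage island also runs at a lower frequency, as in the audio example, the saving is larger still.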

I’ll talk more about the remaining techniques in a follow-up article.

Till then, “It is not the end but a start of new era of revolution”.

Saturday, 30 August 2014

Low Power Techniques : Clock Gating

Disclaimer : All the articles are written with an assumption that the reader has a basic knowledge of Boolean Gates and digital elements.

In this article, the focus is clock gating and its impact on the design of modern-day SoCs. Clocks can be considered the arteries (or veins, for that matter) of the entire SoC, with the clock source being its heart. All the IPs like CPU, Audio, Display, USB, etc. require a clock to function. It is the “CLOCK” which decides the maximum speed of operation for any SoC (here I am not considering the throughput increase due to the various kinds of architectures employed in designing the SoC), and hence special attention should be given to this signal.

Another reason why this signal deserves more attention is that it is the most frequently toggling signal in any SoC and therefore consumes a large fraction of the total SoC power. In a typical SoC for mobile applications, the clock tree consumes 30-40% of the power (yes, this number can be higher depending on the clocking architecture employed).

Clock gating is a technique to turn the clock OFF depending on the requirement or the use case. How this is achieved is the main focus of this topic.


There are essentially two ways to gate the clock: one through hardware control and one through software control.

The software control is through registers which control the PLL, DLL, or any other clock source that we may have. We turn the PLL/DLL/source off by following a specific sequence (which is at times a complex process in itself). Turning off the source is a method to gate the trunk of the clock tree (I’ll talk about the industry lingo on clocks in an upcoming article. Kindly bear with us on terms like trunk, branch, etc).
It may happen that in some cases we want to shut off only a branch of the trunk. This is where hardware control comes into the picture.

This article (and some subsequent articles) will discuss hardware control of the clock.

Below is shown a free running clock :



Fig 1. Free Running Clock
---------------------------------------------------------------------------------------
How can we actually stop it based on some control signal say “enable”?
---------------------------------------------------------------------------------------

Yes, you got it right: we need to AND the two signals, setting the enable to “1” when we need the clock and to “0” when we don’t want the clock output to reach an IP.
The figure below shows a simple AND gate being used as a clock gate.


                                       Fig 2. AND Gate as a Clock Gate and Problem of Glitch

So do we actually use this “AND” gate in our SoCs to keep the clock from reaching an IP?

You guessed it right. No, this is not the standard cell which any modern-day SoC uses. The reason is clear from the above figure itself. If the enable signal changes while the clock is high, it leads to a runt pulse termed a “Glitch”. A glitch is harmful for any SoC as it can lead to unknown states, or states which we don’t want our system to be in. For now, I’ll leave the glitch to a not-so-far-in-the-future article.
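To make the runt pulse concrete, here is a minimal sketch that ANDs a sampled clock with an enable that drops in the middle of a high phase and prints the three traces as ASCII waveforms. The sample resolution and the enable timing are assumptions chosen for illustration, and gate delays are ignored.

```python
SAMPLES_PER_PHASE = 4  # assumed resolution: 4 samples per clock phase


def waveform(bits):
    """Render a list of 0/1 samples as a one-line ASCII trace."""
    return "".join("#" if b else "_" for b in bits)


# Clock starts low: low for t=0..3, high for t=4..7, and so on.
clk = [(t // SAMPLES_PER_PHASE) % 2 for t in range(24)]
# Enable drops at t=6, in the middle of the first high phase (t=4..7).
enable = [1 if t < 6 else 0 for t in range(24)]
gated = [c & e for c, e in zip(clk, enable)]

print("clk   :", waveform(clk))
print("enable:", waveform(enable))
print("gated :", waveform(gated))
```

The gated trace contains a single pulse that is only half as wide as a normal clock-high phase. That shortened pulse is the glitch.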

I’ll discuss the “Clock Gating Cell” in the next article.


Till then, Enjoy!!! 

SESSION 3: Reading from FG-MOSFETs: Part- 1

Hello everyone. In previous sessions, we discussed how to program and erase an FG-MOSFET.

The programmed state corresponds to binary 0, whereas the erased state corresponds to binary 1.

Now the question is: how will you get to know whether your FG-MOSFET has a binary '0' or a binary '1' stored in it? In other words, how will you read the data present in an FG-MOSFET?

------------------------------------------------------------
The answer lies in the threshold voltage.
------------------------------------------------------------

Now suppose I connect an FG-MOSFET to two batteries, one VGS between gate and source, and the other VDS between drain and source. See the figure below.


So the voltage VGS provides an electric field in the direction shown by the red arrows in the figure below:

Because of this electric field, the holes at the interface of the p-type semiconductor and the oxide layer are repelled away. Electrons from the bulk of the p-type semiconductor appear at this interface, forming an n-type channel between the n-type source and the n-type drain. Since this n-type channel is formed in a p-type semiconductor, it is called an inversion channel.

What role does VDS play here? It provides the field from Drain to Source, thus pulling electrons from Source to Drain via the inversion channel formed by applying VGS.

You may visualize the above process in the video below. The small red circles represent the holes in p-type semiconductor and the small green circles represent the electrons. Note the inversion channel formation in the video.


In the above video, the flow of electrons indicates the flow of current in the external circuit.
Now, what if the voltage VGS was not sufficient to create the inversion channel? Then, obviously, current would not be able to flow. So for every FG-MOSFET, there is a particular value of VGS only above which the inversion channel is formed and conduction takes place. This value is called the threshold voltage, VT.

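So, the conclusion till now is: current IDS flows only when VGS is greater than VT. For VGS below VT, no inversion channel forms and no current flows.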

The following graph represents the above relation between current IDS and VGS:
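The shape of that curve can be reproduced with a first-order model. The sketch below assumes the textbook square-law expression for the current above threshold, which is a simplification not derived in this session. The threshold voltage and the constant k are assumed illustrative values.

```python
def drain_current(vgs, vt=1.0, k=2e-4):
    """First-order model: zero below VT, square law above it.

    vt (volts) and k (A/V^2) are assumed illustrative values.
    """
    if vgs <= vt:
        return 0.0          # no inversion channel, so no conduction
    return 0.5 * k * (vgs - vt) ** 2


for vgs in [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]:
    print(f"VGS = {vgs:.1f} V -> IDS = {drain_current(vgs) * 1e6:.1f} uA")
```

The current stays at zero up to VT and rises only once VGS exceeds it, which is the threshold behaviour described above.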


We might intuitively conclude that increasing VDS also increases the current IDS (the greater the VDS, the greater the pull on electrons through the channel), but does this go on forever? It does not.
Reason: Channel pinch-off

Let us see what channel pinch-off is and why it occurs in the next part of this session.
Eventually you will see these concepts coming together to build up the concept of reading from an FG-MOSFET.


Till then : "Thresholds are not the ends! Something definitely lies beyond them"




Sunday, 17 August 2014

The Myth of Area Downsizing in SoCs

Why do we fuss so much about optimizing the area?

Why do we want to optimize the design and learn various design techniques which help in reducing the area to obtain the same functionality?

This is the topic of discussion for this article.

Any chip is fabricated by following a series of steps like lithography, oxidation, etching, metal deposition, ion implantation, etc. These steps are performed on a silicon wafer. The wafer is generally circular owing to the process used to obtain pure silicon, which produces a long cylinder (called a silicon ingot) that is then sliced into wafers. A single wafer can contain many dies (which we package to obtain our chips).
A diagram is shown below which shows the wafer (sliced from the cylinder) and the dies in the wafer:
Fig 1. Silicon Wafer and Ingot,  1a. A Wafer depicting dies ,
1b. A cylindrical silicon ingot which is sliced to get various wafers
Fig 2. Silicon wafers with different die size and defects

Note that it is these dies which we package and obtain our chips from. You can view the actual Silicon Ingot at this link: Making of Silicon Ingots

From figure 1 and figure 2 it is clear that if we want more chips from a wafer we need a smaller die size, or in other words,
“Yield has an inverse relationship with die size”.
The larger the die, the lower the yield for wafers of the same size.
Another thing to note is that if we keep the die size constant, then in order to increase the yield we have to increase the wafer size.

So do we increase the wafer size to obtain higher yield if we can’t reduce the area further?
The answer is partly yes.

We cannot increase the wafer size indefinitely, as it becomes unstable beyond a certain size. The semiconductor industry has so far moved through wafer diameters such as 6 inch (150 mm), 8 inch (200 mm) and 12 inch (300 mm), while 18 inch (450 mm) wafers are still at the research and development stage. Plenty of research is going on into increasing the wafer size.
Another thing which I have not mentioned above with regard to yield is that a smaller area has other perks as well. The processes mentioned above, like lithography, ion implantation, etc., are not perfect. They introduce defects in the chip, leading to faulty chips and thereby lower yield.

How does having a smaller area help?

It is said that a picture is worth a thousand words. Considering Fig 2., it is clear that a smaller die area contributes to higher yield, assuming that the process-induced defects are similar (in Fig 2. above, the red dots correspond to defective dies). If we calculate the yields from Fig 2., in the 1st wafer the yield is 2/4 = 0.5 or 50% (I have ignored the area which is not used in any die, the blue part, to keep things simple), and in the 2nd wafer it is 22/28 ≈ 0.79 or about 79%.
Clearly the yield is greater in 2nd wafer.
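The arithmetic above is simply good dies divided by total dies. A minimal sketch using the die counts quoted from Fig 2. (the function name and the wafer labels are made up for illustration):

```python
def wafer_yield(good_dies, total_dies):
    """Fraction of dies on a wafer that are free of defects."""
    return good_dies / total_dies


# (good dies, total dies) as counted from Fig 2.
wafers = {"wafer 1 (large dies)": (2, 4), "wafer 2 (small dies)": (22, 28)}
for name, (good, total) in wafers.items():
    print(f"{name}: {good}/{total} = {wafer_yield(good, total):.1%}")
```

Both wafers lose a similar handful of dies to defects, but each lost die costs a much larger share of the wafer when the dies are big.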

So the next time you hear people trying to reduce the area, you will definitely have some idea on why the area reduction is important for any SoC or ASIC.

Till then "Try to learn something about everything and everything about something - Thomas Huxley".

Feedback and comments are welcome.

Monday, 11 August 2014

SESSION 2: Tunneling for Programming and Erasing FG-MOSFETs

As discussed in the last session, we shall see how tunneling is responsible for programming and erasing FG-MOSFETs.

We discussed in the previous session about the 2 states of the FG-MOSFET:

  1. Binary 0 : Programmed => Electrons present in Floating Gate
  2. Binary 1 : Erased => Electrons removed from Floating Gate


Following is the diagram of FG-MOSFET which we analyzed in the last session:




When we program the device (20 V at Control Gate and 0 V at Substrate), because of the downward electric field, electrons tunnel upwards through the Lower Oxide Layer into the Floating Gate. In the figure below, the downward black arrows denote the electric field applied through the voltage on the Control Gate. The upward red arrows show the direction of movement of bulk electrons from the p-type substrate.



So what actually happens at the p-type substrate -TO- oxide - TO- Floating Gate interfaces which causes electrons to even cross the oxide layer in between?

 I hope you are familiar with the energy diagrams. Below you can see the energy diagram of the above mentioned interface at equilibrium.


The following notations are used in the diagram:











The following layers of the FG-MOSFET are shown in the diagram:
  1. n+ type Control Gate : made of n+ type Polysilicon
  2. Upper Oxide Layer: made of silicon dioxide or oxide-nitride-oxide (ONO)
  3. n+ type Floating Gate : again made of n+ type Polysilicon
  4. Lower Oxide Layer: made of silicon dioxide
  5. p type Semiconductor
So, when a high voltage, say 20 V is applied to the Control Gate, the energy bands transform as shown in the figure below:



Now, here comes the trick! I guess you must have noticed that the Lower Oxide Layer is much thinner than the Upper Oxide Layer. The high voltage (20 V) causes a huge drop in the Fermi level of the Floating Gate, as a result of which the width of the potential barrier that an electron has to cross reduces (note the triangular barrier developed adjoining the conduction band of the p-type semiconductor). On the other hand, there is not much bending of the barrier in the Upper Oxide Layer because of its thickness. So the thin potential barrier in the Lower Oxide Layer provides a path for electrons to tunnel through, and consequently the electrons get trapped in the Floating Gate.

This tunneling in FG-MOSFETs is called Fowler-Nordheim Tunneling (F-N Tunneling).

After going through a lot of mathematical manipulations and assumptions, we obtain the following relation for the tunneling probability of an electron in F-N Tunneling:


The probability equation above is a negative exponential curve [ f(x) = exp(-x) ] like below (shown for x > 0):


So, the probability that a particle can tunnel through the potential barrier increases as x decreases and gets closer to x = 0.
In our tunneling equation, the probability will hence increase if the whole factor in the exponent reduces to a very small value. This is achieved by applying a high voltage (20 V) and hence a high electric field E. By applying this high electric field, we increase the probability of an electron tunneling into the Floating Gate, which is what makes the program operation happen.
Another conclusion : a small voltage ideally cannot disturb the electrons inside the Floating Gate as a small electric field cannot increase the probability of tunneling of an electron. 
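A rough numerical sketch of this field dependence follows. It assumes the commonly quoted WKB form for a triangular barrier, T ≈ exp(−4·√(2m*)·Φ_B^(3/2) / (3·q·ħ·E)), which may differ in notation from the relation used in this session. The barrier height and effective mass are assumed, illustrative values for a Si/SiO2 interface, so only the trend matters, not the absolute numbers.

```python
import math

Q = 1.602e-19        # electron charge (C)
HBAR = 1.0546e-34    # reduced Planck constant (J*s)
M0 = 9.109e-31       # free electron mass (kg)

# Assumed illustrative values for a Si/SiO2 barrier.
BARRIER_EV = 3.2     # barrier height in eV
M_EFF = 0.42 * M0    # effective mass of the tunneling electron


def fn_tunneling_probability(field_v_per_m):
    """WKB estimate for a triangular barrier: T ~ exp(-B / E)."""
    barrier_j = BARRIER_EV * Q
    b = 4.0 * math.sqrt(2.0 * M_EFF) * barrier_j ** 1.5 / (3.0 * Q * HBAR)
    return math.exp(-b / field_v_per_m)


for field_mv_per_cm in [2, 5, 10]:
    field = field_mv_per_cm * 1e8   # 1 MV/cm = 1e8 V/m
    prob = fn_tunneling_probability(field)
    print(f"E = {field_mv_per_cm:>2} MV/cm -> T ~ {prob:.2e}")
```

The probability changes by many orders of magnitude as the field grows, which is why a high programming voltage is effective while a small voltage leaves the stored electrons essentially undisturbed.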



I guess you now have a brief idea of how tunneling causes the program operation in FG-MOSFETs. It is easy to deduce the band diagrams for the erase operation: the band bending in the oxide layers is just the reverse, and electrons cross from the Floating Gate to the semiconductor.

Now, how will you detect whether there are electrons or not in the Floating Gate? In other words, how  will you read whether the FG-MOSFET contains a binary 0 (programmed)  or binary 1 (erased)? 

So, let us discuss the Read operation in FG-MOSFET in the next session, now that we have completed Program and Erase operations.




Till then, "Anyone who is not shocked by the quantum theory has not understood it"   ---- Niels Bohr.


Sunday, 10 August 2014

Quiz #1 Switching Activity Calculation for N input gates

This is one of the “many to follow” quizzes, and it is taken from Digital Integrated Circuits by Jan M. Rabaey, et al. (one of the best books for learning about digital integrated circuits). Soon we will have real-life circuits (which are heavily used in the semiconductor industry) to analyze in the quiz section.

Switching Activity
We all know that the dynamic power dissipation is given as:
$P_{dynamic} = \alpha_{0 \to 1} C_L V_{DD}^2 f$
Here the factor $\alpha_{0 \to 1}$ is termed the switching activity (or the transition activity). The transition activity is a strong function of the logic function (the logic operation being performed by the gate, such as AND, OR, NAND, etc.). For gates implemented in static CMOS, this factor is the product of the two probabilities defined below. (The subscript 0->1 is there because, for static CMOS gates, power is drawn from the supply only when the output switches from 0 to 1 and not the other way round, since there is a direct path from the supply to the output during a 0 to 1 transition. More to follow on this in the device section.)

$P_0$: The probability that the output will be “0” in the present cycle (or, for that matter, in any cycle)
$P_1$: The probability that the output will be “1” in the next cycle (or, for that matter, in the cycle following the one under consideration).

So $\alpha_{0 \to 1} = P_0 P_1$, or in other words
$\alpha_{0 \to 1} = P_0 (1 - P_0)$

If we assume that the inputs to the N-input gate are independent of each other (a practical simplification) and uniformly distributed, so that every input combination is equally likely, then the switching activity is given as:

$\alpha_{0 \to 1} = \frac{N_0}{2^N} \cdot \frac{N_1}{2^N} = \frac{N_0 (2^N - N_0)}{2^{2N}}$

$N_0$: The number of “0” entries in the output column of the truth table for that gate
$N_1$: The number of “1” entries in the output column of the truth table for the same gate
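The counting in this formula can be automated by enumerating the truth table. The sketch below is one way to do it. The helper names are assumptions, and a 2-input AND gate is used as the worked example so that the quiz gates are left for you.

```python
from fractions import Fraction
from itertools import product


def switching_activity(gate, n_inputs):
    """alpha(0->1) = (N0 / 2^N) * (N1 / 2^N), counted from the truth table."""
    outputs = [gate(bits) for bits in product((0, 1), repeat=n_inputs)]
    n1 = sum(outputs)              # rows where the output is 1
    n0 = len(outputs) - n1         # rows where the output is 0
    total = 2 ** n_inputs
    return Fraction(n0, total) * Fraction(n1, total)


def and_gate(bits):
    """N-input AND: output is 1 only when every input is 1."""
    return int(all(bits))


print("2-input AND:", switching_activity(and_gate, 2))
```

For the 2-input AND gate, N0 = 3 and N1 = 1, so the result is 3/16. Passing a different gate function and input count gives the activity for any other gate under the same assumptions.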

Now for the problem: suppose we have N-input XOR, NOR and NAND gates. Assuming the above assumptions hold, what is the switching activity for each of these gates? What will the switching activity be if we replace the N-input gate with an inverter?

Fig 1. N input NAND Gate
The answers to the above questions are simple to find with the details mentioned above.

Feel free to comment and discuss on the same.

Comments and Feedback are welcome.

Resets - II

Continuing from our last article, Resets - I, this article discusses the synchronous resets mentioned earlier and evaluates their pros and cons.

Before continuing forward, a basic knowledge of the "Flip Flop" is assumed.

Synchronous Resets

Synchronous is derived from two terms, “syn” meaning “together” and “chronos” meaning “time”. And how do we denote time in our systems? The answer is clocks. So, going by the name, this reset takes effect when the reset is HIGH/LOW (depending on the type of reset) at a rising/falling clock edge (again depending on whether the flop is positive or negative edge triggered). If the reset goes high (assuming reset = 1 forces Q = 0) at some time between two active edges t and t+T (where T is the clock period) and comes back to “0” before the next edge, the reset will not be registered. In other words, there is no difference between the D signal (assuming a D-type flop, which is the industry standard) and the reset signal, as both are sampled at the positive/negative edge of the clock.
Consider the diagrams below and it will make more sense:


Fig 1. D Flip Flop



                                              Fig 2. Timing Diagram for D Flip Flop in fig1.
Note that in the above diagram we have assumed a zero-delay model for each of the signals (which is a deviation from the real world, where there are delays. I will come back with more on this).
From the above diagram it is easy to see that no matter what the value of D_flop (the D input of the flop) is, Q_flop will be "0" as long as sync_reset is high at the clock edge. Once the reset has been removed, Q_flop follows D_flop.
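The same behaviour can be captured in a tiny behavioural model. This is a sketch under the same zero-delay assumption. The class and method names are made up for illustration, and the flop is assumed to be positive-edge triggered with an active-high reset.

```python
class DFlopSyncReset:
    """Positive-edge D flop with an active-high synchronous reset."""

    def __init__(self):
        self.q = 0

    def rising_edge(self, d, sync_reset):
        """Called once per rising clock edge; inputs are sampled only here."""
        self.q = 0 if sync_reset else d
        return self.q


flop = DFlopSyncReset()
# One (d, sync_reset) pair per rising clock edge.
samples = [(1, 0), (1, 1), (0, 1), (1, 0), (0, 0)]
trace = [flop.rising_edge(d, rst) for d, rst in samples]
print("Q_flop per edge:", trace)
```

Because the model only looks at its inputs inside rising_edge, a reset pulse that rises and falls between two edges never appears in the samples and is simply ignored, just like a narrow pulse on D.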

The plus points of the synchronous reset are clearly visible:
1. It helps in glitch filtering (a glitch on the reset line that does not occur near the clock edge will not cause any harm to the circuit).
2. A not-so-visible point is that the flop circuit is simple and hence consumes less area.
3. The resulting system is completely synchronous.

The negative points lie in the plus points themselves:

1. It requires a clock in order to be sampled.
2. The applied reset has to strictly adhere to the setup and hold time requirements (provided in the “SPEC” sheet of the flop) so that there are no timing issues.
3. One big problem which arises while using this kind of reset (because it is similar to a data signal) is synthesis. The synthesis tool may not be able to differentiate between data and reset, as both are sampled on the clock edge. It then becomes necessary to add certain “directives” which tell the tool that the input to the “S or R” pin of the flop is a “set or reset” signal.
4. It may also happen that the reset signal ends up on the slowest (critical) timing path and then needs timing closure effort, which we generally don’t want.

Synchronous designs have been the preferred choice of designers for the last few decades, but with increasing complexity newer approaches are being adopted by the industry (GALS being one of them, which I mentioned in an earlier post).

Hope you liked the article.

Your feedback and comments are welcome.

Upcoming Post : Reset - III





Monday, 4 August 2014

SESSION 1: Basic Storage Unit in Flash Memories

Most of us use flash memories today in any number of devices, for example SSDs, eMMC, SD cards, USB pen drives, etc. The list is expanding day by day as we need a substitute for mechanical Hard Disk Drives (HDDs), which produce heat and noise.

I would like to specify that this article assumes a conceptual knowledge of MOSFETs and their working. If you are not familiar with them, you can visit this article: Basics of MOS Devices

In this first session on Flash Memories, let us start from the unit cell of storage, that is, the particular electronic component that stores a bit of data. That component is the Floating Gate MOSFET (FG-MOSFET). Though I think I need not expand MOSFET (Metal-Oxide-Semiconductor Field Effect Transistor), doing so will prove beneficial later!

So, this is what a traditional MOSFET looks like:


n-type MOSFET

The MOSFET has a METAL contact attached to the conducting polysilicon layer. Below the polysilicon layer is the insulating OXIDE layer followed by the SEMICONDUCTOR.

And this is what our FG-MOSFET looks like:
n-type FG-MOSFET


As clearly distinguishable from the image above, the FG-MOSFET has an additional oxide layer and a Floating Gate sandwiched between the two oxide layers. If somehow, a couple of electrons get trapped in this floating gate, ideally they won't be able to leak out even if the power to this device is turned off, thanks to the oxide layers on both sides. This electron trapping in the floating gate forms the basic concept of non-volatile storage using FG-MOSFETs.

So, for the very basic understanding, FG-MOSFETs can have one of the following two states:

1. Electrons are trapped in Floating Gate       : Programmed State, equivalent to BINARY STATE '0'.
2. Electrons are not present in Floating Gate  : Erased State, equivalent to BINARY STATE '1'.

Hence, if you are programming an FG-MOSFET, you are basically pumping electrons into the floating gate. And if you are erasing an FG-MOSFET, you are pulling out the electrons from the floating gate, obviously if the electrons are present. 

But the question now arises: How can you pump electrons into the floating gate with the insulating oxide layers surrounding it? Similarly how can you erase/remove the electrons from the floating gate? 

The answer lies in the keyword : tunneling.

Hence, if we apply a huge voltage (say 20 V) on the control gate and 0V (GND) at the substrate, the electrons present in the bulk in p-type semiconductor will tunnel into the oxide and get trapped in the floating gate. See the video below: 



Similarly, if we want to erase the device, we just reverse the voltages, that is, 20 V at substrate and GND on Control Gate. See the video below:



I guess the term tunneling must be familiar to you, but let us discuss this concept in a bit more detail in the next session. After that, we can be clear about program and erase in an FG-MOSFET!
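Until then, the program and erase conditions described above can be summarised in a toy state model. This is only a sketch: it tracks nothing but whether electrons are trapped, it treats the 20 V example value as a hard threshold (an assumption, since real tunneling is gradual), and the class and method names are made up for illustration.

```python
PROGRAM_VOLTAGE = 20.0  # volts, the example value used in the text


class FGCell:
    """Toy FG-MOSFET: tracks only whether electrons are trapped."""

    def __init__(self):
        self.electrons_trapped = False   # assume the cell starts erased

    def apply(self, control_gate_v, substrate_v):
        """Apply a bias; only a full 20 V difference causes tunneling here."""
        difference = control_gate_v - substrate_v
        if difference >= PROGRAM_VOLTAGE:
            self.electrons_trapped = True    # program: electrons tunnel in
        elif difference <= -PROGRAM_VOLTAGE:
            self.electrons_trapped = False   # erase: electrons tunnel out
        # smaller biases leave the stored charge untouched

    @property
    def bit(self):
        """Programmed reads as binary 0, erased reads as binary 1."""
        return 0 if self.electrons_trapped else 1


cell = FGCell()
cell.apply(control_gate_v=20.0, substrate_v=0.0)
print("after program:", cell.bit)
cell.apply(control_gate_v=0.0, substrate_v=20.0)
print("after erase  :", cell.bit)
```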

Till then, "be constantly amazed with electrons!"