Keras Cheatsheet

What is Keras?

Keras is a high-level, Python-based deep learning API. It runs on top of TensorFlow (its primary backend) and, since Keras 3, also supports JAX and PyTorch backends.

Keras Architecture

  • Models
  • Layers
  • Optimizers
  • Losses
  • Metrics
  • Callbacks

Keras vs TensorFlow

Feature         Keras       TensorFlow
Level           High        Low
Ease of use     ⭐⭐⭐⭐⭐  ⭐⭐⭐
Customization   Medium      High
Production      Yes         Yes
Learning curve  Easy        Steep

Installation & Setup

    pip install tensorflow
    
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers
    
    # Check version
    print(tf.__version__)

Keras Workflow

  • Load Data
  • Preprocess
  • Build Model
  • Compile
  • Train
  • Evaluate
  • Predict
  • Save Model
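
A minimal end-to-end sketch of these steps, reusing the imports from the setup section. The data is a random stand-in, and all shapes and hyperparameters are arbitrary illustrations:

    import numpy as np

    # 1-2. Load and preprocess (random data, already in [0, 1])
    x_train = np.random.rand(1000, 784).astype('float32')
    y_train = np.random.randint(0, 10, size=(1000,))

    # 3. Build
    model = keras.Sequential([
        keras.Input(shape=(784,)),
        layers.Dense(128, activation='relu'),
        layers.Dense(10, activation='softmax')
    ])

    # 4. Compile
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    # 5. Train
    model.fit(x_train, y_train, epochs=5, validation_split=0.2)

    # 6-7. Evaluate and predict (on held-out data in practice)
    model.evaluate(x_train, y_train)
    preds = model.predict(x_train[:5])

    # 8. Save
    model.save('model.keras')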

Tensors & Shapes

Rank  Name       Example shape
0     Scalar     ()
1     Vector     (10,)
2     Matrix     (32, 10)
3     3D Tensor  (batch, time, features)
4     Image      (batch, height, width, channels)
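
Ranks and shapes can be inspected directly on any tensor:

    x = tf.zeros((32, 10))             # rank-2 tensor (a matrix)
    print(x.ndim, x.shape)             # 2 (32, 10)

    imgs = tf.zeros((8, 224, 224, 3))  # rank-4 batch of RGB images
    print(imgs.ndim, imgs.shape)       # 4 (8, 224, 224, 3)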

Keras Model APIs

  • Sequential
  • Functional
  • Subclassing

Sequential API

  • Simple stack of layers
  • Linear pipelines

    model = keras.Sequential([
        layers.Dense(128, activation='relu'),
        layers.Dense(10, activation='softmax')
    ])

Functional API

  • Complex architectures
  • Multi-input/output
  • Residual connections

    inputs = keras.Input(shape=(784,))
    x = layers.Dense(128, activation='relu')(inputs)
    outputs = layers.Dense(10, activation='softmax')(x)

    model = keras.Model(inputs, outputs)

Model Subclassing

  • Custom logic
  • Research models

    class MyModel(keras.Model):
        def __init__(self):
            super().__init__()
            self.dense1 = layers.Dense(128, activation='relu')

        def call(self, inputs):
            return self.dense1(inputs)

Layers Overview

Category       Examples
Core           Dense, Dropout
CNN            Conv2D, MaxPooling2D
RNN            LSTM, GRU
NLP            Embedding
Normalization  BatchNormalization

Dense Layers

  • units
  • activation
  • use_bias
  • kernel_regularizer
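
Putting these arguments together (the regularizer strength is an arbitrary example value):

    layers.Dense(
        units=64,                 # number of output neurons
        activation='relu',
        use_bias=True,
        kernel_regularizer=keras.regularizers.l2(1e-4)
    )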

Keras Advanced Cheatsheet

Convolution Layers

    layers.Conv2D(
        filters=32,
        kernel_size=(3, 3),
        strides=1,
        padding='same',
        activation='relu'
    )

    # Pooling
    layers.MaxPooling2D((2, 2))

Recurrent Layers

  • LSTM: layers.LSTM(128, return_sequences=True)
  • GRU: layers.GRU(64)
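
return_sequences=True makes a recurrent layer emit its output at every timestep, which is what allows stacking; a small sketch (the feature size of 16 is an arbitrary example):

    model = keras.Sequential([
        keras.Input(shape=(None, 16)),            # (timesteps, features)
        layers.LSTM(128, return_sequences=True),  # (batch, timesteps, 128)
        layers.GRU(64)                            # (batch, 64), last step only
    ])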

Embedding Layer

    layers.Embedding(
        input_dim=10000,
        output_dim=128
    )
    # Transforms: word index → dense vector

Activation Functions

Name       Use
ReLU       Default for hidden layers
Sigmoid    Binary classification
Softmax    Multiclass classification
Tanh       RNNs
LeakyReLU  Avoids dead neurons

Loss Functions

Problem        Loss
Regression     MeanSquaredError (MSE)
Binary class   BinaryCrossentropy
Multiclass     CategoricalCrossentropy
Sparse labels  SparseCategoricalCrossentropy
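
The categorical/sparse split depends only on how the labels are encoded:

    # Integer labels like [3, 0, 7] → sparse variant
    loss_sparse = keras.losses.SparseCategoricalCrossentropy()

    # One-hot labels like [[0, 0, 1], ...] → categorical variant
    loss_onehot = keras.losses.CategoricalCrossentropy()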

Optimizers

Optimizer  When to use
SGD        Simple baselines
Adam       Sensible default
RMSprop    RNNs
Adagrad    Sparse data
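
Optimizers can be passed to compile() as strings or instantiated to control the learning rate:

    # Equivalent to optimizer='adam', but with an explicit learning rate
    opt = keras.optimizers.Adam(learning_rate=1e-3)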

Metrics

    metrics=['accuracy']
    # Custom metric: keras.metrics.Precision()

Model Compilation

    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )

Training Models

    model.fit(
        x_train,
        y_train,
        batch_size=32,
        epochs=10,
        validation_split=0.2
    )

Evaluation & Prediction

    model.evaluate(x_test, y_test)
    model.predict(x_new)

Callbacks

    callbacks=[
        keras.callbacks.EarlyStopping(patience=3),
        keras.callbacks.ModelCheckpoint('model.keras')
    ]

Regularization

    layers.Dense(
        64,
        kernel_regularizer=keras.regularizers.l2(0.01)
    )

Initializers

Name           Use
Glorot/Xavier  Default
He             ReLU layers
RandomNormal   Custom setups
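
Initializers are set per layer; for example, He initialization for a ReLU layer:

    layers.Dense(64, activation='relu', kernel_initializer='he_normal')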

Batch Normalization

    layers.BatchNormalization()
    # Benefits: faster training, more stable gradients

Dropout

    layers.Dropout(0.5)
    # Prevents overfitting

Data Pipelines

    dataset = (tf.data.Dataset.from_tensor_slices(data)
               .shuffle(1000)
               .batch(32))

Image Data

    keras.utils.image_dataset_from_directory(
        'data/',
        image_size=(224, 224)
    )

Text Data

    layers.TextVectorization(max_tokens=10000)

Time Series

    # data and targets are arrays; sliding windows of length
    # sequence_length are drawn from data
    keras.utils.timeseries_dataset_from_array(
        data, targets, sequence_length=10
    )

Custom Loss

    def custom_loss(y_true, y_pred):
        return tf.reduce_mean((y_true - y_pred) ** 2)

    # Use like any built-in loss:
    # model.compile(optimizer='adam', loss=custom_loss)

Custom Layers

    class MyLayer(layers.Layer):
        def call(self, inputs):
            return inputs * 2
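
A layer with trainable weights usually creates them in build(), where the input shape is first known; a minimal sketch (the output width of 32 is arbitrary):

    class Linear(layers.Layer):
        def build(self, input_shape):
            # Created lazily once the input feature size is known
            self.w = self.add_weight(shape=(input_shape[-1], 32),
                                     initializer='glorot_uniform',
                                     trainable=True)

        def call(self, inputs):
            return tf.matmul(inputs, self.w)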

Custom Training Loop

    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))

Saving & Loading

    model.save('model.keras')
    model = keras.models.load_model('model.keras')

Transfer Learning & Fine-Tuning

  • Freeze: base_model.trainable = False
  • Fine-tune: base_model.trainable = True (recompile with a lower learning rate), as sketched below
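
A common pattern, sketched here with MobileNetV2 as an example base (any keras.applications model works the same way; the classifier head and learning rates are illustrative):

    base_model = keras.applications.MobileNetV2(
        input_shape=(224, 224, 3), include_top=False, weights='imagenet')
    base_model.trainable = False                  # freeze pretrained weights

    model = keras.Sequential([
        base_model,
        layers.GlobalAveragePooling2D(),
        layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer=keras.optimizers.Adam(1e-3),
                  loss='sparse_categorical_crossentropy')

    # Fine-tuning: unfreeze, then recompile with a much lower learning rate
    base_model.trainable = True
    model.compile(optimizer=keras.optimizers.Adam(1e-5),
                  loss='sparse_categorical_crossentropy')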

Hyperparameter Tuning

  • KerasTuner
  • Grid Search
  • Random Search
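
A minimal KerasTuner sketch (requires pip install keras-tuner; the search space below is an arbitrary example):

    import keras_tuner as kt

    def build_model(hp):
        model = keras.Sequential([
            keras.Input(shape=(784,)),
            layers.Dense(hp.Int('units', min_value=32, max_value=256, step=32),
                         activation='relu'),
            layers.Dense(10, activation='softmax')
        ])
        model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
        return model

    tuner = kt.RandomSearch(build_model, objective='val_accuracy', max_trials=5)
    # tuner.search(x_train, y_train, validation_split=0.2)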

Performance Optimization

  • Mixed precision
  • XLA compilation
  • Batch size tuning
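
Mixed precision and XLA are each a one-line switch (both assume supporting hardware):

    # Mixed precision: compute in float16, keep variables in float32
    keras.mixed_precision.set_global_policy('mixed_float16')

    # XLA: JIT-compile the train step
    # model.compile(optimizer='adam', loss='mse', jit_compile=True)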

Debugging

  • Check shapes
  • Print model.summary()
  • Overfit on a small batch
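
The overfit check: a correctly wired model should drive training loss near zero on a handful of samples; if it cannot, the model, loss, or data is miswired. A sketch, assuming model and the x_train/y_train arrays from the workflow example above:

    model.summary()                                # verify layer output shapes
    model.fit(x_train[:32], y_train[:32], epochs=100, verbose=0)
    model.evaluate(x_train[:32], y_train[:32])     # loss should be near zero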

Common Errors

Error           Cause
Shape mismatch  Wrong input size
NaN loss        Learning rate too high
Overfitting     No regularization

Best Practices

  • Normalize data
  • Start simple
  • Monitor validation loss
  • Save best model
  • Use callbacks

Quick Reference Table

Task         API
Build model  keras.Sequential
Compile      model.compile
Train        model.fit
Evaluate     model.evaluate
Predict      model.predict
Save         model.save