Please navigate to the bottom of the page for Table of Contents

Friday, April 19, 2013

LINQ design interview question

LINQ is one of my favorite interview topics for many reasons. It allows me to check a candidate's ability to define logic, use SQL skills, showcase lambda expressions, distill the problem into easy steps and of course, see some code in action. Also, since LINQ code is generally quite compact, it is perfect for white boarding.

I was recently helping a friend solve a problem and felt that it could be modified to an excellent interview question. It has a bit of algorithm design, heavy use of simple to medium complex LINQ and is not more that 10-15 lines of actual code. Perfect.

Question: Given a class of students  (ID, Name, Grades) with results for at least 2 semesters, find the students whose Grades have significantly improved in the second semester as compared to the first. Bonus: List out top 25% of the students.

As the first step, let us define the rough algorithm for solving this problem:

  1. Load Student data for first semester
  2. Load student data for second semester
  3. Merge Grades for each student from second semester into the first semester result set
  4. For each student, calculate the grade difference for students whose grades have improved (you do not want to waste computational cycles for students whose grades have fallen)
  5. For each student from 4 above, calculate the grade improvement percent
  6. Sort the result by descending order of grade improvement percent
  7. Select rows that constitute top 25% of students

Now, let’s define our POCO class that will represent the student data structure. The data structure has two parts: basic student properties (Id, Name, Grade) and some computational properties that we will use to solve our algorithm.

    public class StudentRecord
{
// basic properties
public int Id { get; set; }
public string Name { get; set; }
public double Grade { get; set; }

// computational properties
public double GradeSecondSemester { get; set; }
public double GradeDifference { get; set; }
public double GradeDifferencePercent { get; set; }
}

In addition to a Student record, we also need to a collection to hold these records. Let’s call it ClassReport:

    public class ClassReport
{
// some basic properties of a class
public int Id { get; set; }
public string Name { get; set; }
public int SemesterId { get; set; }

// list of student records for that semester
public List<StudentRecord> Students;

public ClassReport()
{
Students =
new List<StudentRecord>();
}
}

For steps 1 and 2 of the algorithm, I am not going to focus on this blog. We will assume that we have a way to load this data (stubbed in our code sample below).


Step 1 and 2: Load Student data:


We will mock up our student data in this example. In real life, this data probably will come from a database or a web service call, etc. As you can see below, we generate some dummy student data for two different semesters. Note that the student Id’s need to match across (not really a hard requirement, but good to have for our demo purposes).

        static void Main(string[] args)
{
// Step 1 and 2: Load Student data
ClassReport cr1 = MockSem1();
ClassReport cr2 = MockSem2();

}

private static ClassReport MockSem1()
{
ClassReport cr = new ClassReport()
{
Id = 1,
Name =
"First Grade",
SemesterId = 1
};

// add dummy data - in real life, this comes from database
cr.Students.Add(
new StudentRecord()
{
Id = 1,
Name =
"Ray",
Grade = 90
});
cr.Students.Add(
new StudentRecord()
{
Id = 2,
Name =
"Jack",
Grade = 80
});
cr.Students.Add(
new StudentRecord()
{
Id = 3,
Name =
"John",
Grade = 60
});
cr.Students.Add(
new StudentRecord()
{
Id = 4,
Name =
"Lisa",
Grade = 67
});
cr.Students.Add(
new StudentRecord()
{
Id = 5,
Name =
"Jill",
Grade = 94
});
return cr;
}

private static ClassReport MockSem2()
{
ClassReport cr = new ClassReport()
{
Id = 1,
Name =
"First Grade",
SemesterId = 2
};

// add dummy data - in real life, this comes from database
cr.Students.Add(
new StudentRecord()
{
Id = 1,
Name =
"Ray",
Grade = 85
});
cr.Students.Add(
new StudentRecord()
{
Id = 2,
Name =
"Jack",
Grade = 95
});
cr.Students.Add(
new StudentRecord()
{
Id = 3,
Name =
"John",
Grade = 82
});
cr.Students.Add(
new StudentRecord()
{
Id = 4,
Name =
"Lisa",
Grade = 95
});
cr.Students.Add(
new StudentRecord()
{
Id = 5,
Name =
"Jill",
Grade = 96
});
return cr;
}

Step 3: Merge Grades for each student from second semester into the first semester result set


Here is where our LINQ fun starts. In one of my previous post, I had explained how LINQ join works. Now would be a good time to review bunch of LINQ related interview questions and answers from the past.

            // step 3: Merge Grades for each student from second 
// semester into the first semester result set

// in-line merge; no need for a separate assignment
// for each student in first semester report
(from firstSem in cr1.Students
// join to second sem class report
join secondSem in cr2.Students
// on student id as the key
on firstSem.Id equals secondSem.Id
// assign second sem grade to first (via a select; clever)
select firstSem.GradeSecondSemester = secondSem.Grade)
// execute
.ToList();
As you can see from the LINQ statement, we join first semester class report with second semester and then do an in-line update of the first semester data structure’s GradeSecondSemester property.

Step 4: Calculate grade difference for students whose grades have improved


Now we need to process those students whose grades improved in the 2nd semester and update our data structure to reflect the difference:

// step 4: For each student, calculate the grade difference 
// for students whose grades have improved (you do not want
// to waste computational cycles for students whose grades have fallen)
cr1.Students
// find all students who did better in 2nd semester
.FindAll(s => ((s.GradeSecondSemester - s.Grade) > 0))
// and update their grade difference property
.ForEach(x => x.GradeDifference = (x.GradeSecondSemester - x.Grade));

We use FindAll to find those students who have better grades and then loop for those records using ForEach to update the GradeDifference property.


Step 5: For each student from 4 above, calculate the grade improvement percent

// Step 5: For each student from 4 above, 
// calculate the grade improvement percent

// 5.a. calculate total difference sum
double gradeDiffSum = cr1.Students.Sum(x => x.GradeDifference);

// 5.b. calculate percent for each row
cr1.Students
// find all students who did better in 2nd semester
.FindAll(s => ((s.GradeSecondSemester - s.Grade) > 0))
// update grade percent
.ForEach(x => x.GradeDifferencePercent =
((x.GradeDifference * 100) / gradeDiffSum));

Step 6: Sort the result by descending order of grade improvement percent


By this point, I am hoping you get the gist of how to approach such LINQ queries. Sorting a list is a relatively simple operation as shown below:

    // Step 6: Sort the result by descending order of grade improvement percent
var sortedList =
(
from entry in cr1.Students
orderby entry.GradeDifferencePercent descending
select
entry)
.ToList();
Step 7: Select rows that constitute top 25% of students

We use TakeWhile here to keep selecting students till we reach our number.

    //Step 7: Select rows that constitute top 25% of students
// 7.a. - sum of grade diff percent
double gradeDiffPercentSum = 0;

// 7.b - take as many rows as needed till we reach the 25% sum
var bestStudents =
sortedList
.TakeWhile(x =>
((gradeDiffPercentSum += x.GradeDifferencePercent) <= 25.0))
.ToList();

// 7.c account for the case where the first element is more than 25
if (bestStudents.Count() == 0)
bestStudents = sortedList.Take(1).ToList();
That’s it. You now have a list of top 25% students whose grades have improved in the second semester.

I am pasting the complete program here in case you like to see the whole thing in one shot.


The complete solution:

using System.Linq;

namespace
StudentReport
{
class
Program
{
static void Main(string
[] args)
{
// Step 1 and 2: Load Student data
ClassReport
cr1 = MockSem1();
ClassReport
cr2 = MockSem2();

// step 3: Merge Grades for each student from second
// semester into the first semester result set

// in-line merge; no need for a separate assignment
// for each student in first sem report
(from firstSem in
cr1.Students
// join to second sem class report
join secondSem in
cr2.Students
// on student id as the key
on firstSem.Id equals
secondSem.Id
// assign second sem grade to first (via a select; clever)
select
firstSem.GradeSecondSemester = secondSem.Grade)
// execute
.ToList();

// step 4: For each student, calculate the grade difference
// for students whose grades have improved (you do not want
// to waste computational cycles for students whose grades have fallen)
cr1.Students
// find all students who did better in 2nd semester
.FindAll(s => ((s.GradeSecondSemester - s.Grade) > 0))
// and update their grade difference property
.ForEach(x => x.GradeDifference = (x.GradeSecondSemester - x.Grade));

// Step 5: For each student from 4 above,
// calculate the grade improvement percent

// 5.a. calculate total difference sum
double
gradeDiffSum = cr1.Students.Sum(x => x.GradeDifference);

// 5.b. calculate percent for each row
cr1.Students
// find all students who did better in 2nd semester
.FindAll(s => ((s.GradeSecondSemester - s.Grade) > 0))
// update grade percent
.ForEach(x => x.GradeDifferencePercent =
((x.GradeDifference * 100) / gradeDiffSum));


// Step 6: Sort the result by descending order of grade improvement percent
var
sortedList =
(
from entry in
cr1.Students
orderby entry.GradeDifferencePercent
descending
select
entry)
.ToList();


//Step 7: Select rows that constitute top 25% of students
// 7.a. - sum of grade diff percent
double
gradeDiffPercentSum = 0;

// 7.b - take as many rows as needed till we reach the 25% sum
var
bestStudents =
sortedList
.TakeWhile(x =>
((gradeDiffPercentSum += x.GradeDifferencePercent) <= 25.0))
.ToList();

// 7.c account for the case where the first element is more than 25
if
(bestStudents.Count() == 0)
bestStudents = sortedList.Take(1).ToList();

// do something with this result set
}

private static ClassReport
MockSem1()
{
ClassReport cr = new ClassReport
()
{
Id = 1,
Name =
"First Grade"
,
SemesterId = 1
};

// add dummy data - in real life, this comes from database
cr.Students.Add(
new StudentRecord
()
{
Id = 1,
Name =
"Ray"
,
Grade = 90
});
cr.Students.Add(
new StudentRecord
()
{
Id = 2,
Name =
"Jack"
,
Grade = 80
});
cr.Students.Add(
new StudentRecord
()
{
Id = 3,
Name =
"John"
,
Grade = 60
});
cr.Students.Add(
new StudentRecord
()
{
Id = 4,
Name =
"Lisa"
,
Grade = 67
});
cr.Students.Add(
new StudentRecord
()
{
Id = 5,
Name =
"Jill"
,
Grade = 94
});
return
cr;
}

private static ClassReport
MockSem2()
{
ClassReport cr = new ClassReport
()
{
Id = 1,
Name =
"First Grade"
,
SemesterId = 2
};

// add dummy data - in real life, this comes from database
cr.Students.Add(
new StudentRecord
()
{
Id = 1,
Name =
"Ray"
,
Grade = 85
});
cr.Students.Add(
new StudentRecord
()
{
Id = 2,
Name =
"Jack"
,
Grade = 85
});
cr.Students.Add(
new StudentRecord
()
{
Id = 3,
Name =
"John"
,
Grade = 70
});
cr.Students.Add(
new StudentRecord
()
{
Id = 4,
Name =
"Lisa"
,
Grade = 82
});
cr.Students.Add(
new StudentRecord
()
{
Id = 5,
Name =
"Jill"
,
Grade = 96
});
return
cr;
}
}
}

Friday, March 22, 2013

Entity Framework Interview Question - Explain ENUM usage in EF5

Entity Framework 5 introduced support for Enum’s amongst other new features. This was a long awaited feature by the community. In this post, we will explore how enums work with entity framework, how can you code them, how to use them, how they are represented in the database, etc.

Question: Write code to demonstrate ENUM usage in Entity Framework.

For this exercise, please make sure that your project references at least version 5 of Entity Framework. You can use NuGet to add that to your project.

Model

In Entity Framework, an enumeration can have the following underlying types: Byte, Int16, Int32, Int64 , or SByte. I chose 3 different enum types to highlight the differences in database representation for different enum types.
    public enum DefaultEnum
    {
        Red = 1,
        Blue = 2,
        Green = 3
    }

    public enum ByteEnum : byte
    {
        RedByte,
        BlueByte,
        GreenByte
    }

    public enum ShortEnum : short
    {
        RedShort,
        BlueShort,
        GreenShort
    }

Now, let’s define our Account class that uses these enums.
    public class EnumDemoAccount
    {
        [Key]
        [DatabaseGeneratedAttribute(DatabaseGeneratedOption.Identity)]
        public int Id { get; set; }
        public string Name { get; set; }

        // our enum fields
        public DefaultEnum DefaultEnum { get; set; }
        public ByteEnum ByteEnum { get; set; }
        public ShortEnum ShortEnum { get; set; }
    }

As you can note, I am generating the Identity column as auto-incrementing (bonus).

Context


Our Context definition is pretty standard for this example. All I am telling the system is to create a database with the name I have given (instead of it choosing one based on namespace + class.
    public class AccountContext : DbContext
    {
        public DbSet<EnumDemoAccount> EnumDemoAccounts { get; set; }

        public AccountContext() : base("EFDemoDb")
        {}
    }

I also like to like to create a context initializer for test projects so that I always have a clean database for each run:
    public class DatabaseContextInitializer : 
        DropCreateDatabaseAlways<AccountContext>
    {
        protected override void Seed(AccountContext dbContext)
        {
            base.Seed(dbContext);
        }
    }

Test code


The test code is simple as well. We first set our initialization strategy, create a context, add a few records and see how things look.
        static void Main(string[] args)
        {
            // initialization 
            Database.SetInitializer<AccountContext>
                (new DatabaseContextInitializer());

            // create a new context
            AccountContext ac = new AccountContext();

            // add a couple of test entries
            ac.EnumDemoAccounts.Add(
                new EnumDemoAccount()
                {
                    Name = "First",
                    DefaultEnum = DefaultEnum.Blue,
                    ByteEnum = ByteEnum.GreenByte,
                    ShortEnum = ShortEnum.RedShort
                });

            ac.EnumDemoAccounts.Add(
                new EnumDemoAccount()
                {
                    Name = "Second",
                    DefaultEnum = DefaultEnum.Green,
                    ByteEnum = ByteEnum.RedByte,
                    ShortEnum = ShortEnum.BlueShort
                });

            // save to db
            ac.SaveChanges();

            // display
            foreach (var account in ac.EnumDemoAccounts)
            {
                Console.WriteLine(
                    "Name: {0}\n\tDefaultEnum:{1}\n\tByteEnum:{2} \n\tShortEnum: {3}",
                    account.Name,
                    account.DefaultEnum,
                    account.ByteEnum,
                    account.ShortEnum);
            }

            Console.ReadKey();
        }

Output


Now let’s review the output we got:

EnumDemo-output

As you can see, the enum values survived the round trip to the database. Wonderful! And they are all strongly typed so we can use them in our regular programming.

Database


Understanding what happened on the database side is equally important. Remember, we created 3 different enum types in our model – int, short and byte. Does the data type of the enum make any difference in the storage scheme? You bet! Let’s see how Entity Framework internally represents these types to the database:

EnumDemo-db-table

Note that the default enum data type (int) remained as int. Byte was translated as tinyint and Short as smallint. So, if you are using enums, think hard about your use cases and choose the appropriate datatype for your enums.

And finally, to round things up, let’s see what data values are stored in our database:

EnumDemo-db-query-output

No surprises here. Auto generated identity column, proper values for our enums. Things look good. Hoping that this post has given you a good starting point about enums in Entity Framework 5.

Thursday, March 21, 2013

Entity Framework Interview Question – what are the different ways to configure database name?

Entity Framework allows you to create a model (either using code-first approach or modelling tools) which can target an existing database or create a new one. Generally, you would extend DbContext and the derived class would call one of the base constructors of base class to figure out how to connect to the database. The .Net framework now increasingly favors and supports convention over configuration and this is quite evident in how DbContext discovers the database. We will cover a few of the approaches in how a code-first model discovers the database.

Connection by Convention
Let's examine the following code: 

namespace MvcApplication1.Models
{

    public class UsersContext : DbContext
    {
        public UsersContext()
        {
        }

        public DbSet<UserProfile> UserProfiles { get; set; }
    }

}

In the example shown above, if you have not done any other configuration in your application, then calling the parameterless constructor on DbContext will cause DbContext to run in Code First mode with a database connection created by convention. In this example DbContext uses the namespace qualified name of your derived context class — MvcApplication1.Models.UsersContext — as the database name and creates a connection string for this database using either SQL Express or LocalDb as shown below:









Connection by Convention with database name specified
If you do not like the name given by default, you can also specify what database name to use as an argument to the base class constructor as shown in the code below:

    public class UsersContext : DbContext
    {
        public UsersContext() : base("SuperCoolAppDatabase")
        {
        }

        public DbSet<UserProfile> UserProfiles { get; set; }
    }

This results in a database being created (or used if existing) with a more cleaner name as shown:








Connection by Convention withconnection string define in app/web config
If you do not like to code your database name in code, you can more easily specify it in the config file for your application. This is also very useful if you would be using something other than SQL Express or LocalDb as your database; e.g. SQL Server, SQL CE, etc. 

Follow the following convention to add a connect string entry in your web.config or app.config:


  <connectionStrings>
    <add name="MyAppConnStr" 
         providerName="System.Data.SqlServerCe.4.0"
         connectionString="Data Source=Blogging.sdf"/>
  </connectionStrings>

Now that you have defined your named connection string, there are two ways in which this can be discovered by the context:
1. If the name (in this case MyAppConnStr) matches the name of the context (with or without the namespace) it will be used. In our example above, this is not true (unless we named the connect string to be UsersContext).
2. The connection string name can be passed as an argument to the base constructor of your context as shown in the code below:

    public class UsersContext : DbContext
    {
        public UsersContext()
            : base("MyAppConnStr")
        {
        }

        public DbSet<UserProfile> UserProfiles { get; set; }
    }

 Since the system is flexible enough to use the name passed into DbContext constructor to determine whether it is a connection string name or database name, a good recommendation is to explicitly pass it as name="your conn string name". That makes the definition and intent clear to all and leaves no room for ambiguity. Also, an exception will be thrown if a connection string with the given name is not found.

    public class UsersContext : DbContext
    {
        public UsersContext()
            : base("name=MyAppConnStr")
        {
        }

        public DbSet<UserProfile> UserProfiles { get; set; }
    }

To recap, in this post, we explored different means to allow Entity Framework to discover which database to use using the code-first model.

Wednesday, March 20, 2013

SQL read text from a file

Consider the following scenario: Your database has a table called Customer (who doesn't these days) with millions of rows (lucky you). You production support team just called you in the middle of the night with a live site issue - Around 10,000 customers are getting a message that their logins have been disabled. And they send you a text file with 10,000 customer id's :-)

Great! Now you need to look these customers up in the database and enable their access again. 

So how can you load these 10K values from a text file into SQL? One way would be to edit the text file, replace each new line with a comma and then use the resulting string in the select clause as shown:

select * from Customer where Id in (1, 2, 6, 8, 10)   

Now this might prove to be too much non-techy way. You are a programmer, right? You don't hard-code stuff! You write code!!

So how can you load these disabled customer list in the database programatically?

Create a temporary table and call it DisabledCustomers with just one column (name it Id and make sure it allows nulls and is not primary key - just in case your list from support has duplicates).  

Run the following SQL to load the values from the file into this table.  
BULK INSERT dbo.DisabledCustomers
   FROM 'c:\temp\customerids.txt'
   WITH
      (
         ROWTERMINATOR ='\n'
      )
 

Now your select statement becomes quite simple: 
select * from Customer c
inner join DisabledCustomers dc on c.Id = dc.Id
 

Many production systems have BULK INSERT disabled? As an exercise to you, my dear reader, I look forward to seeing some innovative solutions to that.

SQL self join or sub-query interview question (employee-manager salary)

One of my favorite interview questions that tips even seasoned SQL guys (maybe because it's too simple) is around querying data that involves a self join.

What is a self join?
A self join is a join in which a table is joined with itself, specially when the table has a foreign  key which references its own primary key. 

Question: Given an Employee table which has 3 fields - Id (Primary key), Salary and Manager Id, where manager id is the id of the employee that manages the current employee, find all employees that make more than their manager in terms of salary. Bonus: Write the table creation script.

Friday, March 15, 2013

Data migration strategies and design patterns

Data migration is an extremely common operation in software design and development. Whenever a new system is introduced or a legacy system is redesigned, existing data has to be moved from the legacy system to the new target system.

From an interview perspective, you should know and understand the following well understood design patterns and guidelines that make the process of data migration smoother, faster and accurate with least amount of surprises during go-live.

Tuesday, January 22, 2013

SOA interview questions

In this post we will review a few key aspects that you should understand about SOA based services. In my experience, there is no substitute for actual experience developing SOA based services, but this introduction should get you enough to start.


What is SOA?
SOA is an architecture for building applications using reusable, interoperable services which have well defined business functionalities and can be orchestrated to achieve a specific functionality by utilizing them together.

4 facts you should know about SOA Services

Monday, January 21, 2013

Distributed vs Parallel computing


In this post I will provide a very high level overview of Distributed versus Parallel computing.

Distributed computing refers to the study of distributed systems to solve complex or time consuming problems, broken down to small tasks, across multiple computers (nodes) each of which has its own memory and disk.

Saturday, January 19, 2013

How to Boost your Self-Confidence

The importance of confidence in everything we do cannot be underemphasized. Right or wrong, people and providence treat you in a manner that you allow. But how do you gain self confidence? A task seemingly impossible to some and for others it is as easy as getting up in the morning ready for a bright new day. Confidence is not born in a man, but it is created by man. Anyone can have confidence. It is absolutely easier than you once have believed.

Tips to boost your Self-Confidence

Wednesday, January 16, 2013

Design and Architecture interview questions with answers

Urmila Singhal has written very detailed and interesting posts on how to interview for a technical program manager role.

Interviewing to become a Program Manager from a different role: Introduction to what are the skills needed to become a Program Manager - http://www.urmilasinghal.com/2012/11/interviewing-to-become-program-manager.html

Is project management like parenting? - Interviewing for a program manager Part 2: What is Project Management, what are the different methodologies of Project Management - http://www.urmilasinghal.com/2012/11/is-project-management-like-parenting.html

Communication with your grandma and the technical architect at the same time - Interviewing for a program manager Part 3: What is communication, its importance. Principles of communication and Different types of communication - http://www.urmilasinghal.com/2012/11/communication-with-your-grandma-and.html

Approaching design questions in an interview - Interviewing for a program manager Part 4: What is design and the types of design questions - http://www.urmilasinghal.com/2012/11/approaching-design-questions-in.html

Abstract Design Questions Approach - Interviewing for a program manager Part 5: Approach to abstract design problems with examples - http://www.urmilasinghal.com/2012/12/abstract-design-questions-approach.html

Architectural and Algorithmic design questions - Interviewing for a program manager Part 6: Example of an architectural and algorithmic design problem with a solution - http://www.urmilasinghal.com/2013/01/architectural-and-algorithmic-design.html